VIII — Cauchy Theory


§ 1. Integrals of Holomorphic Functions
§ 2. Cauchy's Integral Formulas
§ 3. Some Applications of Cauchy's Method


In Chapter VII, § 4 we showed how a significant part of the classical theory 
of holomorphic or analytic functions on C can be obtained from Fourier series. 
In fact, our universal method for constructing them — a fundamental idea 
of Cauchy — is to integrate holomorphic functions along curves drawn in 
their domains of definition and thereby obtain a version of the “fundamental 
theorem of differential and integral calculus” (FT) for holomorphic functions, 
and then to deduce countless consequences. 

I will present only a very few of them. The general theory of analytic functions
is of unlimited scope,¹ and the results needed in mathematical fields
where holomorphic functions are encountered are, on the other hand, very 
limited in most cases. For instance, a famous result like Riemann’s theorem 
on conformal mapping of simply connected domains is rarely used, although it 
is recommended to be familiar with it for the sake of “general knowledge”; 
as for classifying simply connected Riemann surfaces, which would be far 
more useful, it would need far too complicated developments. The very basic 
results and methods that we will present in this chapter are, for example, 
quite sufficient for the chapter devoted to the theory of Riemann surfaces or 
for the one on elliptic and modular functions. 


¹ The two volumes by Reinhold Remmert, Funktionentheorie (Springer, 1995, also
available in English edition), more than 700 very compact pages, can give some
idea of the general theory of analytic functions, but do not cover Riemann
surfaces, elliptic and automorphic functions, differential equations in the complex
domain, special functions, etc., areas that would require thousands of additional
pages and that, at any rate, have been the subject of specialized presentations.
Numerous other available presentations include Walter Rudin, Real and Complex
Analysis (McGraw-Hill, 1966, also available in French), Jean Dieudonné, Calcul
Infinitésimal (Hermann, 1968), in particular useful for its many exercises,
Eberhard Freitag & Rolf Busam, Funktionentheorie (Springer-Verlag, 1995), which
lists several other books, Serge Lang, Complex Analysis (Springer, many
editions), John B. Conway, Functions of One Complex Variable (2 vol., Springer,
1978-95), Carlos A. Berenstein & Roger Gay, Complex Variables. An
Introduction (Springer, 1991).


Learning to use basic ideas is, therefore, much better than learning a 
multitude of general theorems, albeit ingenious and deep, unless, of course, 
the aim is to specialize in general theory. 

Cauchy theory (it would be much better to say: Cauchy and Weierstrass)
has been and continues to be the subject of many accounts merely differing
from each other on detailed points of presentation or style; not seeing the
need to reproduce them an umpteenth time, I have tried, whenever possible,
not to follow them, in particular regarding homotopy. As will be seen
in the next chapter, apart from the residue theorem, another easy consequence,
Cauchy's method falls within the much more general framework of
multivariate differential forms.
§ 1. Integrals of Holomorphic Functions


1 — Preliminary Results 


(i) The fundamental theorem (FT) of differential and integral calculus (Chap. V,
§ 3). In its simplest form, take a continuous function f on an interval I ⊂ R
and, choosing an a ∈ I, set

$$F(x) = \int_a^x f(t)\,dt;$$

a differentiable function such that F'(x) = f(x) for all x ∈ I is thus obtained.
Conversely, any primitive for f is given up to an additive constant by this
formula.

If we start with a regulated function² f (a less simple case), the previous
formula defines a continuous function F admitting right-hand and left-hand
derivatives at every x ∈ I, given by

$$F'_d(x) = \lim_{\substack{h \to 0 \\ h > 0}} \frac{F(x+h) - F(x)}{h} = \lim_{\substack{h \to 0 \\ h > 0}} f(x+h)$$

and by a similar left-hand formula. In particular, the derivative F'(x) exists
outside some countable set D of points of discontinuity of f. Conversely, if
there is a regulated function f and a continuous function F in I which, outside
some countable subset of I, admits a derivative equal to f(x), then, up to
a constant, F is again given by the standard formula (Chap. V, § 3, n° 13).
F is then said to be a primitive for f.

For simplicity's sake, we will say that a function F is of class C^{1/2} in I
if it is a primitive for a regulated function, which we will always write F'; this
derivative exists except possibly for a countable number of values of the variable: an
unimportant ambiguity that can be removed by setting F'(x) = F'_d(x) for all
x. For this it is not sufficient that F be differentiable outside some countable
set. We adopt the notation C^{1/2} because C⁰ means that F is continuous, a
less restrictive condition, whereas C¹ means that F' exists everywhere and
is continuous, a more restrictive one.


² Recall that a function f defined on an interval I ⊂ R is said to be regulated if it
satisfies the following three equivalent conditions: (a) it has both right and left
limits at all points of I; (b) for any compact interval K ⊂ I and r > 0, K can
be partitioned into intervals on which f is constant up to r; (c) there exists a
sequence of step functions converging uniformly to f on every compact set K ⊂ I
(hence on I if I is compact). See Chap. V, n° 7, Theorem 6. The sum, product and
quotient of two regulated functions are also of the same type. If f and g are of
class C^{1/2}, the product fg is a continuous function which admits a derivative
outside some countable set, the regulated function f'(t)g(t) + f(t)g'(t); fg is,
therefore, a primitive for f'g + fg'. This allows us to apply the integration by
parts formula to functions of class C^{1/2} defined later.
(ii) Differential calculus in R². Let U be an open subset of R² and f a
map from U to R or R². It is said to be differentiable at a point c ∈ U if,
h ∈ R² being a varying vector, f(c + h) − f(c) is "approximately linear" in
h for sufficiently small h; to be precise, there needs to be a linear map from
R² to R or R², the tangent map to f (or derivative, or differential of f)
at c, written f'(c), such that

$$f(c+h) = f(c) + f'(c)h + o(h)$$

as the length |h| of the vector h tends to 0; hence

$$f'(c)h = \frac{d}{dt} f(c + th) \quad \text{for } t = 0. \tag{1.1}$$

If c = (a,b), then

$$f(a+u, b+v) = f(a,b) + pu + qv + o(|u| + |v|),$$

where the coefficients p, q, elements of R or R² as the case may be, do not
depend on u, v. These are the partial derivatives³

$$p = D_1 f(c) = \lim_{u \to 0} \frac{f(a+u, b) - f(a,b)}{u}, \qquad q = D_2 f(c) = \lim_{v \to 0} \frac{f(a, b+v) - f(a,b)}{v}$$

of f at c. Hence, if h = (u,v) ∈ R², then

$$f'(c)h = D_1 f(c)u + D_2 f(c)v. \tag{1.2}$$


In the case of a function with values in R², if we set f(x,y) = (f_1(x,y), f_2(x,y)),
then for c = (a,b), the map f'(c), therefore, sends h = (u,v) to the vector

$$\begin{aligned} f'(c)h &= D_1 f(c)u + D_2 f(c)v \\ &= \big(D_1 f_1(c), D_1 f_2(c)\big)u + \big(D_2 f_1(c), D_2 f_2(c)\big)v \\ &= \big(D_1 f_1(c)u + D_2 f_1(c)v,\; D_1 f_2(c)u + D_2 f_2(c)v\big). \end{aligned} \tag{1.3}$$

Conversely, if the partial derivatives exist for all c ∈ U and are continuous on
U, in which case f is said to be of class C¹ in U, then f is differentiable at
all points of U. If D_1 f and D_2 f are also of class C¹, f is said to be of class
C², and so on. We then have

$$D_1 D_2 f = D_2 D_1 f. \tag{1.4}$$

³ A notation such as D_1 f(c) will always denote the value of the function D_1 f at
c.
We often write

$$df(c;h) = f'(c)h$$

instead of f'(c)h; for h = (u,v), df(c;h) = u clearly holds if f is the coordinate
function (x,y) ↦ x, df(c;h) = v if f is (x,y) ↦ y, and df(c;h) = h
if f is⁴ the identity map z = (x,y) ↦ (x,y). We can, therefore, write

$$df\big(c; dz(c;h)\big) = D_1 f(c)\,dx(c;h) + D_2 f(c)\,dy(c;h),$$

or

$$df(c;dz) = D_1 f(c)\,dx + D_2 f(c)\,dy \tag{1.5}$$

for short.

We will also need the chain rule in two cases. 

(a) Suppose that f is of class C¹ on U and let μ : I ⟶ U be a function
defined on an interval I of R, whence a composite function p = f ∘ μ : t ↦
f[μ(t)]. The function p is differentiable at each point t at which μ is
differentiable, and

$$p'(t) = f'[\mu(t)]\,\mu'(t) \tag{1.6}$$

is the image of the vector⁵ μ'(t) ∈ R² under the linear map tangent to f at
μ(t). If f and μ are of class C¹, so is p; if f is of class C¹ and μ of class C^{1/2},
the obviously continuous function p is of class C^{1/2}, since p'(t) exists outside
some countable set and, being the product of the continuous function f'[μ(t)]
and the regulated function μ'(t), is regulated. The function p is, therefore, a
primitive for f'[μ(t)] μ'(t).

(b) If g is a map from an open set V ⊂ R² to U, whence again a composite
map p = f ∘ g : V ⟶ R², then, at each point c ∈ V where g is differentiable,
so is p, and

$$p'(c) = f'[g(c)] \circ g'(c) \tag{1.7}$$

is the composite or product of the linear maps tangent to g at c and to f at g(c).


⁴ Here the letter z represents the point with coordinates x, y in R² rather than
the complex number x + iy. It is in the theory of holomorphic functions that it
is essential to regard points in the plane as complex numbers. Having said that,
using the letter z to represent a point in R² or in any other set is not forbidden.

⁵ If μ(t) = (μ_1(t), μ_2(t)), μ'(t) is the vector (μ_1'(t), μ_2'(t)). If we do not distinguish
between a point or a vector (u,v) ∈ R² and the complex number u + iv ∈ C,
the complex number μ'(t) becomes the usual derivative of the complex valued
function μ(t). However, this interpretation is not generally compatible with
formula (6), since in the latter f'[μ(t)] is a linear map from R² to R² and not a
mere complex number. It is only if f is holomorphic that the three derivatives
occurring in (6) can be interpreted as complex numbers. The possibility of
interpreting elements of R² in these two different ways often leads to confusion that,
for good reason, does not occur in R^n when n ≥ 3.
This result can easily be recovered by writing that

$$p(c+h) = f[g(c+h)] \approx f[g(c) + g'(c)h] \approx f[g(c)] + f'[g(c)]\,g'(c)h = p(c) + f'[g(c)]\,g'(c)h,$$

but this is not a proof. The previous formula can also be written as

$$dp(z; dz) = df\big[g(z); dg(z; dz)\big]: \tag{1.8}$$

in the differential df(z;dz) of f, z and dz are replaced by g(z) and by the
differential of g at z, as was already known by Leibniz.

When the points of R² are identified with complex numbers, any function
f(x,y) = (f_1(x,y), f_2(x,y)) with values in R² is identified with the complex
valued function

$$z \longmapsto f_1(z) + i f_2(z),$$

the partial derivatives being then identified with the functions

$$D_1 f = D_1 f_1 + i D_1 f_2, \qquad D_2 f = D_2 f_1 + i D_2 f_2.$$


In case (a), the composite function p(t) = f_1[μ(t)] + i f_2[μ(t)] is complex
valued; setting μ(t) = μ_1(t) + iμ_2(t) and D = d/dt,

$$\begin{aligned} Dp(t) &= D_1 f_1[\mu(t)]\,D\mu_1(t) + D_2 f_1[\mu(t)]\,D\mu_2(t) + i\big\{D_1 f_2[\mu(t)]\,D\mu_1(t) + D_2 f_2[\mu(t)]\,D\mu_2(t)\big\} \\ &= \big\{D_1 f_1[\mu(t)] + i D_1 f_2[\mu(t)]\big\}\,D\mu_1(t) + \big\{D_2 f_1[\mu(t)] + i D_2 f_2[\mu(t)]\big\}\,D\mu_2(t); \end{aligned}$$

the same formula

$$Dp(t) = D_1 f[\mu(t)]\,D\mu_1(t) + D_2 f[\mu(t)]\,D\mu_2(t)$$

is, therefore, recovered, but this time, the derivatives are the usual complex
valued derivatives of complex valued functions. In case (b), it is necessary to
assume that two holomorphic functions f and g are being composed to obtain
a simple formula; see below.


(iii) Holomorphic functions. Let f be a complex valued function defined
in the open subset U of C and suppose that as a map from U to R² it is
differentiable at c ∈ U. It, therefore, has a derivative f'(c) : R² ⟶ R² which
is linear over the field R. It may be C-linear, i.e. of the form h ↦ ah, where
a ∈ C is a constant (namely the value of the map for h = 1); this means that
then

$$f(c+h) = f(c) + ah + o(h)$$

as |h| tends to 0, a relation where, this time, c, h, a, f(c), etc. are complex
numbers. The number a, which characterizes f'(c), is then given by the
relation

$$a = \lim_{h \to 0} \frac{f(c+h) - f(c)}{h}, \tag{1.9}$$

where h is made to approach 0 through non-zero complex numbers. f is then
said to be differentiable in the complex sense at the point c of U and we write
a = f'(c). Hence the notation f'(c) represents both a complex number and
a linear map from R² to R²; this apparent ambiguity is due to the fact that,
in R², maps of the form h ↦ ah, where a ∈ C is a constant, are just C-linear
maps; it is, therefore, natural to make no distinction between such a map and
the coefficient a which determines it. This generalizes to maps from a
field K into itself that are linear over K: these are the schoolboys' functions
x ↦ ax.

The function f is said to be holomorphic on U if the limit f'(z) exists
for all z ∈ U and is a continuous function of z (Chap. II, § 3, n° 19); or
equivalently if f is C¹ as a function of (x,y) and satisfies Cauchy's relation

$$D_1 f = -i D_2 f \quad (= f') \tag{1.10}$$

(Chap. III, § 5, n° 20), which conveys exactly the C-linearity of the
differential (2).
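Cauchy's relation (1.10) can be tested numerically. The following is only a minimal sketch, assuming central finite differences are accurate enough for the chosen step; the names `partials` and `cauchy_defect` are illustrative and do not come from the text. It estimates D_1 f and D_2 f at a point and measures |D_1 f + i D_2 f|, which should vanish for a holomorphic function and not for z ↦ conj(z).

```python
import cmath

def partials(f, c, eps=1e-6):
    """Central-difference estimates of D1 f (x-direction) and D2 f (y-direction) at c."""
    d1 = (f(c + eps) - f(c - eps)) / (2 * eps)
    d2 = (f(c + 1j * eps) - f(c - 1j * eps)) / (2 * eps)
    return d1, d2

def cauchy_defect(f, c, eps=1e-6):
    """|D1 f + i D2 f| at c; close to 0 exactly when relation (1.10) holds at c."""
    d1, d2 = partials(f, c, eps)
    return abs(d1 + 1j * d2)

c = 0.3 + 0.7j
holo = cauchy_defect(cmath.exp, c)                      # exp is holomorphic
nonholo = cauchy_defect(lambda z: z.conjugate(), c)     # conjugation is only R-linear
print(holo, nonholo)
```

For the conjugation map, D_1 f = 1 and D_2 f = −i, so the defect equals 2 at every point.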


There is a chain formula for holomorphic functions. It is formally identical
to the one in the theory of functions of a real variable and is used in two cases.
(a) Consider first an interval I ⊂ R, an open set U ⊂ C, a map μ : I ⟶ U
and a function f defined and holomorphic on U, whence a composite map
p : t ↦ f[μ(t)] from I to C. If μ is differentiable at a point t in I, so is p and

$$p'(t) = f'[\mu(t)]\,\mu'(t), \tag{1.11}$$

where f' denotes the function defined by the limit (9). This result generalizes
immediately to the case of a composite function of the form p = f ∘ μ where
μ is a function of several real variables s_1, …, s_p: denoting by D_i the partial
differential operator with respect to s_i,

$$D_i p(s_1, \ldots, s_p) = f'[\mu(s_1, \ldots, s_p)]\,D_i \mu(s_1, \ldots, s_p), \tag{1.11'}$$

because in order to differentiate with respect to s_i, the other variables are
kept fixed, which reduces to (11).

(b) If I is now replaced by an open subset V of C and μ by a function
g : V ⟶ U holomorphic on V, the composite map p from V to C is also
holomorphic and

$$p'(z) = f'[g(z)]\,g'(z) \tag{1.12}$$

for all z ∈ V.




Indeed, by (7), formula (12) is exact if the derivatives f', g' and p' are
interpreted as linear maps from R² to R² and the right hand side as the
composition of the maps f'[g(z)] and g'(z). By assumption, these maps are
C-linear. But the composition of two C-linear maps h ↦ ah and h ↦ bh on C
gives the map h ↦ abh; formula (12) can, therefore, be obtained by substituting
the corresponding complex numbers for the maps p'(z), etc.

As shown in Chap. VII, saying that a function f is holomorphic on an
open set G amounts to saying that it is analytic on G, i.e. that, for each a ∈ G,
it has a power series expansion f(z) = Σ c_n (z − a)^n which converges and
represents it on a disc centered at a and in fact on the largest disc centered
at a contained in G; the power series which represents f in a neighbourhood
of a is just its Taylor series

$$f(z) = \sum_{n \ge 0} f^{(n)}(a)\,(z - a)^{[n]},$$

where, we remind the reader, z^{[n]} = z^n/n!. The terms "holomorphic"
and "analytic" are, therefore, synonymous.

Nonetheless, all the results that we will prove in this chapter are based 
only on the initial definition of holomorphic functions, in other words, do 
not use their analyticity, a result that we will recover by Cauchy's traditional
method. Hence, we will maintain the strict distinction between
“holomorphic” functions and “analytic” functions until we again prove the 
equivalence of these two notions. 


2 — The Problem of Primitives 


(i) Local primitives of a holomorphic function. One of the basic problems in
the theory of holomorphic functions is to find, for a function f holomorphic on
an open subset U of C, a primitive of f on U, i.e. a holomorphic function
F such that F' = f. If U is a disc centered at a, the problem always has a
solution since a power series can be differentiated term by term (Chap. II,
n° 19):

$$f(z) = \sum_{n \ge 0} c_n (z - a)^n \;\Longrightarrow\; F(z) = c + \sum_{n \ge 0} c_n (z - a)^{n+1}/(n+1), \tag{2.1}$$

where c is an arbitrary constant. A proof which does not use analyticity
and which generalizes to differential forms consists in observing that if f
is holomorphic on the disc D : |z| < R and if F' = f, then, by (1.11), the
derivative of the function t ↦ F(tz), defined at least on [0,1] for a given
z ∈ D, is F'(tz)z = f(tz)z. The FT then shows that, when F(0) = 0,

$$F(z) = \int_0^1 f(tz)\,z\,dt \tag{2.2}$$

for all z ∈ D.
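Formula (2.2) lends itself to a quick numerical check. The sketch below is illustrative only: the function name `primitive_on_disc`, the midpoint rule and the number of subintervals are assumptions, not part of the text. It approximates the integral of f(tz)z over [0,1] for f = exp and compares the result with exp(z) − 1, the primitive of exp vanishing at 0.

```python
import cmath

def primitive_on_disc(f, z, n=2000):
    """Approximate F(z) = integral over [0,1] of f(t z) z dt (composite midpoint rule)."""
    h = 1.0 / n
    total = 0j
    for k in range(n):
        t = (k + 0.5) * h          # midpoint of the k-th subinterval
        total += f(t * z)
    return total * z * h

z = 0.5 - 0.8j
approx = primitive_on_disc(cmath.exp, z)
exact = cmath.exp(z) - 1           # primitive of exp that vanishes at 0
print(abs(approx - exact))
```

Any holomorphic f on a disc centered at 0 could be substituted for `cmath.exp`, provided a closed form of its primitive is known for comparison.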
Conversely, if F is defined on D using this formula, then F is holomorphic
and satisfies F' = f. To see this, observe that the function of (t, x, y) under
the sign ∫ is C¹. This allows us to differentiate under the sign ∫ with respect
to x or y; setting D_1 = d/dx and D = d/dt, omitting the limits of integration
and noting that D_1 z = 1, we get

$$D_1 F(z) = \int f(tz)\,dt + \int D_1[f(tz)]\,z\,dt = \int f(tz)\,dt + \int t f'(tz) z\,dt;$$

since f is holomorphic, D_1 f = f' and f'(tz)z is the derivative of f(tz) with
respect to t; hence, integrating by parts, it follows that

$$\int t f'(tz) z\,dt = \Big[t f(tz)\Big]_0^1 - \int f(tz)\,dt = f(z) - \int f(tz)\,dt$$

and so D_1 F = f. If D_1 is now replaced by D_2 = d/dy, the calculation
remains the same except that D_2 z = i and D_2 f = if'. Then D_2 F = if; as a result, F
satisfies Cauchy's condition and F'(z) = D_1 F(z) = f(z), qed.

This method applies more generally to any star domain G, i.e. one which
contains a point a such that, for all z ∈ G, the line segment [a, z] is contained
in G: replace tz by a + t(z − a) in (2). This is for example the case of an open
convex set, or of C − R₊ on choosing a on the negative real axis, etc. We will,
however, find further down a less restrictive result regarding G.

Let us return to the general case. The results obtained above mean that
every holomorphic function f on an open set G has a primitive in the
neighbourhood of each point of G; but, as already seen (Chapter IV, § 4) for 1/z
and its pseudo-primitive Log z, this local result in no way implies the
existence of a global primitive, i.e. valid on all of G; we will return to this point
later.


(ii) Integration along a path. Admissible paths. Let f be a function defined
and holomorphic on an open connected subset G of C, i.e. a domain, and
suppose that f has a primitive F in G. If we were on R, the FT

$$F(z) - F(a) = \int_a^z f(\zeta)\,d\zeta \;\Longleftrightarrow\; F'(z) = f(z), \tag{2.3}$$

where, despite the notation, z and the integration variable ζ are real, would
allow us to calculate F up to an additive constant. But, at first sight,
integrating from a point a ∈ G to another point z ∈ G is not well-defined on
C.
Nevertheless, we know, by (1.11), that if μ : I ⟶ G is a map from
an interval I ⊂ R to G, i.e. a path⁶ in G, then

$$\frac{d}{dt} F[\mu(t)] = F'[\mu(t)]\,\mu'(t) = f[\mu(t)]\,\mu'(t) \tag{2.4}$$

at all points where the derivative μ'(t) exists. If μ is of class C¹, if I = [u,v]
and if μ(u) = a, μ(v) = z, the simplest version of FT, therefore, shows that

$$F(z) - F(a) = \int_u^v f[\mu(t)]\,\mu'(t)\,dt \tag{2.5}$$

since the right hand side of (4) is a continuous function of t; formula (2) is
obtained for μ(t) = tz. This result still holds if μ is of class C^{1/2}: formula (4)
holds at all points where μ has a derivative, hence outside some countable
subset of I, and the function f[μ(t)] μ'(t) is regulated; the continuous function
F[μ(t)] is, therefore, a primitive for the latter, and so (5) follows. A path of
class C^{1/2} will also be said to be admissible.


If we set ζ = μ(t), à la Leibniz, then dζ = μ'(t)dt, so that on the right
hand side of (5) we integrate the expression f(ζ)dζ; this leads us to define the
integral of f along a path μ by

$$\int_\mu f(\zeta)\,d\zeta = \int_I f[\mu(t)]\,\mu'(t)\,dt \tag{2.6}$$

in the same way as in Chap. V, eq. (5.16) for Cauchy's formula for a circle.
The notation introduced on the left hand side of (6) could be justified by
observing that, choosing a subdivision of I = [u,v] by the points u = t_0 <
t_1 < … < t_n = v and setting ζ_i = μ(t_i), integral (5) is approximately equal
to

$$\sum f(\zeta_i)\,\mu'(t_i)\,(t_{i+1} - t_i)$$

and hence, by the mean value formula, no less approximately, to
Σ f(ζ_i)(ζ_{i+1} − ζ_i); whence notation (6). The reader will easily be able to
add the ε necessary to correct this simplified argument by subdividing I so
that the functions considered are piecewise constant up to ε; see point (iv)
further down.
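The two approximations just described can be compared on an example. This is a hedged sketch, not the author's construction: the helper names, the choice of path (a quarter of the unit circle from 1 to i), the integrand z² and the number of subdivision points are all assumptions made for illustration. The first function discretizes the right hand side of (2.6); the second forms the sums Σ f(ζ_i)(ζ_{i+1} − ζ_i).

```python
import cmath
import math

def integral_with_derivative(f, mu, dmu, n=4000):
    """Midpoint approximation of the integral over [0,1] of f(mu(t)) mu'(t) dt, as in (2.6)."""
    h = 1.0 / n
    return sum(f(mu((k + 0.5) * h)) * dmu((k + 0.5) * h) for k in range(n)) * h

def integral_by_increments(f, mu, n=4000):
    """Sum of f(zeta_i) (zeta_{i+1} - zeta_i) with zeta_i = mu(i / n)."""
    pts = [mu(i / n) for i in range(n + 1)]
    return sum(f(pts[i]) * (pts[i + 1] - pts[i]) for i in range(n))

def mu(t):
    """Quarter of the unit circle, from 1 (t = 0) to i (t = 1)."""
    return cmath.exp(1j * math.pi * t / 2)

def dmu(t):
    """Derivative of mu with respect to t."""
    return 1j * math.pi / 2 * cmath.exp(1j * math.pi * t / 2)

def f(z):
    return z * z

a = integral_with_derivative(f, mu, dmu)
b = integral_by_increments(f, mu)
exact = (1j ** 3 - 1) / 3          # z^3/3 evaluated between 1 and i
print(abs(a - exact), abs(b - exact))
```

The sum over increments converges more slowly (its error is of the order of the mesh), which is consistent with its role here as a heuristic justification of the notation rather than as a computational device.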

Note that integral (6) does not depend exclusively on the "curve" μ(I)
defined by μ(t) as t varies in I; indeed, the latter does not change if the map μ


⁶ Everyone uses the letter γ to denote a path. I will use the letter μ because
(i) computer keyboard "designers" have had the good idea to include one and
only one Greek letter, namely μ, (ii), more seriously, as we will see a bit further
down, the function μ(t) occurs by way of the Radon (or Stieltjes) measure dμ(t) =
μ'(t)dt which it defines.
is replaced by ν(t) = μ(φ(t)), where φ is a surjective map from an interval
J to I; supposing, for simplicity's sake, that φ is of class C¹,

$$\int_\nu f(z)\,dz = \int_J f[\mu(\varphi(t))]\,\mu'[\varphi(t)]\,\varphi'(t)\,dt$$

by the chain rule; hence, setting F(t) = f[μ(t)] μ'(t),

$$\int_\mu f(z)\,dz = \int_I F(t)\,dt, \qquad \int_\nu f(z)\,dz = \int_J F[\varphi(t)]\,\varphi'(t)\,dt.$$

As I = φ(J), the equality of these two integrals seems to follow from the
change of variables formula for integrals (Chapter V, § 6, n° 19). But the
latter concerns oriented integrals. Hence, the equality

$$\int_\mu f(z)\,dz = \int_\nu f(z)\,dz$$

supposes that φ maps the initial (resp. terminal) point of J onto the initial
(resp. terminal) point of I. Otherwise, the previous relation only holds up
to sign. In practice, only strictly increasing "changes of parameter" φ are
considered. This avoids difficulties and reduces the question to paths for which
I = [0,1]. From now on, we will suppose this to be the case, unless stated
otherwise.
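The effect of a change of parameter can be observed numerically. The sketch below rests on assumptions chosen purely for illustration (the parabolic arc μ(t) = t + it², the increasing change of parameter φ(t) = t², the integrand exp and a midpoint rule); none of these names come from the text. It compares the integral along μ with the integral along ν = μ ∘ φ and along the reversed path t ↦ μ(1 − t).

```python
import cmath

def path_integral(f, mu, dmu, n=4000):
    """Midpoint approximation of the integral over [0,1] of f(mu(t)) mu'(t) dt."""
    h = 1.0 / n
    return sum(f(mu((k + 0.5) * h)) * dmu((k + 0.5) * h) for k in range(n)) * h

f = cmath.exp

def mu(t):
    """Arc of parabola from 0 to 1 + i."""
    return t + 1j * t * t

def dmu(t):
    return 1 + 2j * t

def phi(t):
    """Increasing C^1 change of parameter mapping [0,1] onto [0,1]."""
    return t * t

def nu(t):
    return mu(phi(t))

def dnu(t):
    return dmu(phi(t)) * 2 * t      # chain rule: mu'(phi(t)) phi'(t)

def rev(t):
    """Same curve traversed in the opposite direction."""
    return mu(1 - t)

def drev(t):
    return -dmu(1 - t)

i_mu = path_integral(f, mu, dmu)
i_nu = path_integral(f, nu, dnu)
i_rev = path_integral(f, rev, drev)
print(i_mu, i_nu, i_rev)
```

Under these assumptions the first two values agree up to discretization error, while the third is their opposite, in line with the remark that the change of variables formula concerns oriented integrals.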

Admissible paths or those of class C^{1/2} cover all cases that might arise. In
practice, I can almost always be divided into intervals on which the function
μ is C¹, if not linear. But, once the notion of a primitive for a regulated
function has been understood, using these "piecewise C¹" (or piecewise
linear) paths, as they are called, is not any easier than using paths of class
C^{1/2}. Applying Theorem 12 bis of Chap. V, n° 13, we see that if a path is
considered to be the trajectory of a moving object, then admissible paths can
be characterized by imposing the following conditions upon them:


(a) the map μ is continuous,

(b) it has a right and left derivative at every point t,

(c) these are equal outside some countable subset D of I (for example, it
may be that D = I ∩ Q, but it is better to avoid this kind of path in
practical calculations...),

(d) the right (or left) derivative is a regulated function of t, i.e. has right-hand
and left-hand limits for all t or, equivalently, is the uniform limit
on I of step functions.


The trajectory defined by μ(t), possibly passing over the same point several
times, therefore, admits a "velocity vector" μ'(t) outside D, and it can
change direction ("angular points") at points of D. It has a tangent at each
point where μ'(t) exists and is non-zero. As any driver attempting to park
between two cars knows, the case μ'(t) = 0 can entail a cusp point.⁷


⁷ Example: t ↦ (t², t³) at t = 0, with I = [−1, 1].




Fig. 2.1. 


(iii) Integral along a path as a Stieltjes integral. It is sometimes convenient
to interpret integral (6) as a Stieltjes integral (Chap. V, § 9, n° 32) with
respect to a Radon or a Stieltjes complex measure defined on I by the function
μ(t). In Chap. V, we only defined Stieltjes integrals over an interval I of R
with respect to real increasing functions in order to obtain positive measures,
but this method can easily be generalized to linear combinations with complex
coefficients of increasing functions.⁸ This is the case of every C^{1/2} function μ
since the standard formula μ' = Re(μ')⁺ − Re(μ')⁻ + … transforms μ into a
linear combination of increasing functions, as they are primitives of positive
functions. Complex Radon measures are thus obtained on I in the sense of
Chap. V, § 9, i.e. continuous linear functionals on the space C⁰(I) equipped
with the norm of uniform convergence, at least if I is compact, which is the
only case that interests us here.

As all C^{1/2} functions are continuous, formula (32.1) of Chap. V, § 9 defining
the measure of an interval J = (u,v) ⊂ I with respect to μ becomes
μ(J) = μ(v) − μ(u) regardless of the nature of J. Then the integral ∫ f(t)dμ(t)
of a continuous or more generally of a regulated function f can be defined
like the usual Riemann integral: the sum Σ f(t_p) μ(J_p) is associated to
every finite partition I = J_1 ∪ … ∪ J_n of I into intervals, where t_p ∈ J_p; the
integral ∫ f(t)dμ(t) is the limit of these sums when the partition considered
becomes finer. If the J_p are chosen so that f is constant up to r on each J_p
(characterization of regulated functions), then


⁸ The classical terminology is functions of bounded variation. They are directly
characterized as follows: there exists a positive finite constant M such that

$$\sum |\mu(t_{i+1}) - \mu(t_i)| \le M$$

for all points t_1 < t_2 < … < t_n of the interval considered. See for example
Rudin, Chap. 6.
$$\Big| \int f(t)\,d\mu(t) - \sum f(t_p)\,\mu(J_p) \Big| \le r\,\|\mu\| \tag{2.7}$$

since for all t, f(t) is, up to r, equal to the value at t of the step function
equal to f(t_p) on each J_p. Recall that the notation ‖μ‖ for the norm or total
mass of the measure μ is the smallest positive number such that

$$\Big| \int f(t)\,d\mu(t) \Big| \le \|\mu\| \cdot \|f\| \tag{2.8}$$

for all continuous and in fact regulated functions. If the measure μ is positive,
i.e. if the function μ(t) is real and increasing, then ‖μ‖ = μ(I). In fact, the
main point of (8) is not the exact value of ‖μ‖; any constant independent of
f will do. But see n° 4, (ii).

To show that the curvilinear integral (6) is also a Stieltjes integral, first
recall that in Chap. V we obtained a formula (32.15) which says that, for
any real, increasing function μ(t) of class C¹ on I and for any continuous
function f on I,

$$\int f(t)\,d\mu(t) = \int f(t)\,\mu'(t)\,dt; \tag{2.9}$$

it trivially generalizes to the case when μ is complex valued. In fact,
formula (9) still holds when μ(t) is C^{1/2}. To see this, suppose that μ' is positive,
i.e. that μ is increasing. As μ is a primitive for μ', first of all

$$\mu(J) = \mu(v) - \mu(u) = \int_u^v \mu'(t)\,dt$$

for any interval J = (u,v) ⊂ I. Using as above a sufficiently fine partition of
I, the regulated function μ' may be assumed to be constant up to r on each
J_p. Hence, for all t_p ∈ J_p,

$$|\mu(J_p) - \mu'(t_p)\,m(J_p)| \le m(J_p)\,r, \tag{2.11}$$

where m is the usual Lebesgue measure. Replacing each term μ(J_p) by
μ'(t_p) m(J_p) in the Riemann sum Σ f(t_p) μ(J_p), the error made is less than
‖f‖ Σ m(J_p) r = ‖f‖ m(I) r. So, by (7),

$$\Big| \int f(t)\,d\mu(t) - \sum f(t_p)\,\mu'(t_p)\,m(J_p) \Big| \le \big(\|\mu\| + \|f\|\,m(I)\big)\,r,$$

qed.
In all cases, (6) can, therefore, be written as

$$\int_\mu f(\zeta)\,d\zeta = \int_I f[\mu(t)]\,d\mu(t) \tag{2.10}$$

in line with Leibniz's ideas.
(iv) A necessary and sufficient condition for a primitive. Let us return
to a holomorphic function f on a domain G of C. If it admits a primitive F
on G and if a point a of G is chosen, then, as seen at the start of this n°,

$$F(z) - F(a) = \int_\mu f(\zeta)\,d\zeta \tag{2.12}$$

for any admissible path μ connecting a to z in G. Hence, the integral of f
along such a path depends only on its endpoints.

In the general case, it may be tempting to construct a primitive by applying
the previous formula and choosing its value at a arbitrarily, for example
F(a) = 0. But this definition of F is totally ambiguous: the value of the
integral can very well depend on the choice of the path μ connecting a to
z in G, as the case⁹ of the function 1/z already shows. Hence, the notation
F(z) used is, a priori, not well-defined; the only reasonable notation is to set

$$F(\mu) = \int_\mu f(\zeta)\,d\zeta \tag{2.13}$$


for every integration path¹⁰ μ. As in the case of the logarithm of a complex
number z ≠ 0 (Chap. IV, § 4 or Chap. VI, n° 16), we get a set 𝓕(z) of
possible values of the function sought, namely all the numbers obtained by
integrating f along a path μ connecting a to z in G or, in the case of the
logarithm, all the numbers obtained by using a uniform branch of Log along
a path connecting a fixed point a to the point z (as we will see, this amounts
to integrating 1/ζ along this path). But as in the case of the logarithm, the
problem is to construct a truly holomorphic function F, and in particular a
continuous one, such that F(z) ∈ 𝓕(z) for all z ∈ G. In the case of the logarithm,
we have seen this to be possible if and only if the following condition is
satisfied: if, for some given a ∈ G and a varying path μ : I ⟶ G with initial


⁹ If it was independent of the path in this case, the integral of 1/ζ along the path
t ↦ exp(2πit) connecting the point a = 1 to the point z = 1 would be equal
to that obtained by integrating along the "constant" path t ↦ 1, i.e. to 0.
However, the integral over the circle is obtained by integrating the function 2πi
over [0, 1] and hence is not zero. Integrals in 1/z will be discussed in detail later.

¹⁰ Mathematicians who invented the "calculus of variations" almost three centuries
ago already had the idea of considering functions of a curve varying in the plane,
on a surface or in space, a curve along which a given function is integrated; a
century ago, the mathematician Vito Volterra used to call these line functions.
For a "smooth" surface S in R³, we can, for example, try to find the curves
of minimal length drawn on S connecting two given points: the geodesics; the
length of a curve μ is given by (4.8) and by comparing it to a curve "infinitely
near" to it we get a differential equation characterizing the geodesics. Quite an
old problem in mechanics consists in finding a curve connecting A to B for two
given points A and B such that the time taken to go from A to B by an object
moving along it under the action of gravity is minimum. Fermat already knew
that the trajectory of a light beam going from a point A to a point B through a
medium whose refractive index varies is the one that minimizes travel time. Etc.
point a in G, we consider the uniform branch t ↦ L(t) of Log z along μ
which takes a given value in the set Log a at t = 0, then the value of this
branch at t = 1 should only depend on the endpoint z = μ(1) of μ.

As we will see, the problem of primitives has a similar solution. First, if f
has a primitive F on G, integral (13) only depends on z. Conversely, suppose
that for fixed a, the value of integral (13) is, for all z, independent of the
path μ; we can then talk unequivocally of the function F(z). The function F
is then a global primitive for f.

Indeed, consider an arbitrary point b ∈ G and let D ⊂ G be an open disc
centered at b. Then, formula (2) adapted to the point b gives a primitive F_D
of f on D, and F_D(z) − F_D(b) = ∫ f(ζ)dζ where integration is along the line
segment [b, z]. Since a constant can be added to F_D, we may assume that
F_D(b) = F(b). To calculate F(z) at a point z ∈ D, f must be integrated
along an arbitrary path connecting a to z in G; for example, we can choose
a path connecting a to b in G, then a path from b to z in D; by definition,
integration along the arc connecting a to b gives F(b), and, as seen above,
the arc connecting b to z, for example the radius, gives F_D(z) − F_D(b) =
F_D(z) − F(b); adding, we find F(z) = F_D(z) on D. It follows that F is
holomorphic and satisfies F' = f on D, and hence globally on G since b is
arbitrary. As a result:


Theorem 1. A holomorphic function f on a domain G has a primitive on G
if and only if its integral along any admissible path in G depends only on the
latter's endpoints.


In particular, the integral of f along a closed path, i.e. such that μ(0) =
μ(1), is zero. In fact this condition is sufficient for ensuring the existence of
a primitive. Indeed, if

μ_1, μ_2 : [0,1] → G

are two paths connecting a given point a to the same point z, we get a
closed path [0,1] → G by following first the path [0,1/2] → G given by
t ↦ μ_1(2t), then the path [1/2,1] → G given by t ↦ μ_2(2 − 2t). Clearly,
the integral of f along the first path is equal to the integral along μ_1, and
the integral along the second path is the opposite of the integral along
μ_2. The integral along the total path¹¹ [0,1] → G is, therefore, the
difference between the integrals along μ_1 and μ_2. As a result, if the integral
along every closed path is zero, these are equal for all μ_1 and μ_2. Hence the
result follows from Theorem 1: A holomorphic function f on a domain G has a
primitive on G if and only if its integral along any admissible closed path in
G is zero.
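
The gluing of two paths into a closed one can be checked numerically. The Python sketch below is only an illustration under stated assumptions: the names `path_integral` and `closed_from`, the two half-circles and the midpoint-rule quadrature are choices made here, not part of the text.

```python
import cmath

def path_integral(f, mu, n=20000):
    """Midpoint-rule approximation of the integral of f along mu: [0, 1] -> C."""
    total = 0j
    for k in range(n):
        t0, t1 = k / n, (k + 1) / n
        total += f(mu((t0 + t1) / 2)) * (mu(t1) - mu(t0))
    return total

def closed_from(mu1, mu2):
    """Follow t -> mu1(2t) on [0, 1/2], then t -> mu2(2 - 2t) on [1/2, 1]."""
    return lambda t: mu1(2 * t) if t <= 0.5 else mu2(2 - 2 * t)

# Two sample paths from 1 to -1: the upper and lower unit half-circles.
upper = lambda t: cmath.exp(1j * cmath.pi * t)
lower = lambda t: cmath.exp(-1j * cmath.pi * t)
loop = closed_from(upper, lower)

square = lambda z: z * z     # has the primitive z**3 / 3 on C
inverse = lambda z: 1 / z    # has no primitive on C - {0}

diff_square = path_integral(square, upper) - path_integral(square, lower)
diff_inverse = path_integral(inverse, upper) - path_integral(inverse, lower)
loop_square = path_integral(square, loop)
loop_inverse = path_integral(inverse, loop)
```

In both cases the integral along the closed path agrees with the difference of the two integrals; it vanishes for z², which has a primitive, and not for 1/z.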

For another proof of Theorem 1, take an open disc D ⊂ G centered at z.
To go from a to a point z + h ∈ D, we can follow a path connecting a to z and
then the radius [z, z + h], i.e. the path t ↦ z + th; then F(z + h) − F(z) is
clearly the integral along this radius. Hence F(z + h) − F(z) = ∫ f(z + th)h dt,
where, as usual, integration is over [0,1]. It follows that

F(z + h) − F(z) − f(z)h = ∫ [f(z + th) − f(z)] h dt;

f being continuous, |f(z + th) − f(z)| < r for all t ∈ I provided |h| < r′
(uniform continuity on a compact set); so

F(z + h) = F(z) + f(z)h + o(h)

as h approaches 0, which proves the existence of F′(z) = f(z), qed.

¹¹ which may not be C^1 even if that is the case of μ_1 and μ_2. Hence it is
necessary to include paths that are admissible, or at least piecewise C^1.


(v) The case of a contractible domain. A naive attempt at constructing
a primitive without using Theorem 1 would be to arbitrarily choose for every
z ∈ G a path μ_z connecting a to z and to set F(z) = F(μ_z). Albeit strange,
this leads to the result provided μ_z depends on z in not too... arbitrary a
manner. This, we will see, implies a drastic restriction on G. Formula (2)
used in the case of a star domain clearly falls within this framework, but is
based on an all too providential choice of μ_z.

So let us assign to every z ∈ G a path

μ_z : t ∈ [0,1] ↦ μ_z(t)

connecting a to z = x + iy = (x, y) in G and set

H(z,t) = μ_z(t).


So H(z,0) = a, H(z,1) = z for all z ∈ G. To show that the function

(2.14)    F(z) = F(μ_z) = ∫_{μ_z} f(ζ)dζ = ∫ f[H(z,t)] DH(z,t).dt,

where D = d/dt, is a primitive for f, it would suffice to show that, with
respect to x and y, it has partial derivatives D_1F and D_2F equal to f and if
respectively. For this, let us assume that differentiation is possible under the
∫ sign without any difficulty (this would be miraculous if μ_z was arbitrarily
chosen) and calculate as Euler or Cauchy would have done; the calculation
is similar to the one done for function (2), only slightly harder. Using the
product and chain rules for differentiation and the relations DD_1 = D_1D,
D_1f = f′, the FT gives¹²


¹² In a notation such as DH(z,t).D_1H(z,t), the point means that the operator D
is applied to H(z,t) and not to the product H(z,t)D_1H(z,t).
(2.15)    D_1F(z) = ∫ D_1{f[H(z,t)] DH(z,t)} dt =

        = ∫ {f′[H(z,t)] D_1H(z,t).DH(z,t) +
             + f[H(z,t)] D_1DH(z,t)} dt =

        = ∫ {f′[H(z,t)] DH(z,t).D_1H(z,t) +
             + f[H(z,t)] DD_1H(z,t)} dt =

        = ∫ D{f[H(z,t)] D_1H(z,t)} dt = f[H(z,1)] D_1H(z,1) −
             − f[H(z,0)] D_1H(z,0).


But since H(z,0) = μ_z(0) = a is independent of z and in particular of x,
D_1H(z,0) = 0; and since H(z,1) = μ_z(1) = z = x + iy, D_1H(z,1) = 1. So
(15) becomes D_1F(z) = f(z). If D_1 = d/dx is replaced by D_2 = d/dy, the
calculation is similar except that D_2f = if′ and D_2H(z,1) = i; hence
D_2F(z) = if(z). The function F is, therefore, holomorphic and is a primitive
for f.
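
As a sanity check of the relations D_1F = f and D_2F = if, here is a minimal Python sketch for the simplest contraction, the straight-line one H(z,t) = a + t(z − a) of a star domain. The base point `A`, the test function exp, the midpoint rule and the finite-difference step are all assumptions made for the illustration.

```python
import cmath

A = 0.5 + 0.25j   # base point a (arbitrary choice for this sketch)

def H(z, t):
    """Straight-line contraction: H(z, 0) = A and H(z, 1) = z."""
    return A + t * (z - A)

def DH(z, t):
    """Derivative of H(z, t) with respect to t."""
    return z - A

def F(f, z, n=2000):
    """Formula (2.14): integral over [0, 1] of f[H(z, t)] DH(z, t) dt (midpoint rule)."""
    h = 1.0 / n
    return sum(f(H(z, (k + 0.5) * h)) * DH(z, (k + 0.5) * h) for k in range(n)) * h

def partials(f, z, eps=1e-5):
    """Central-difference approximations of D_1 F and D_2 F at z = x + iy."""
    d1 = (F(f, z + eps) - F(f, z - eps)) / (2 * eps)
    d2 = (F(f, z + 1j * eps) - F(f, z - 1j * eps)) / (2 * eps)
    return d1, d2

f = cmath.exp
z0 = -0.3 + 1.2j
d1, d2 = partials(f, z0)   # expected to be close to f(z0) and i f(z0)
```

For this particular f the primitive is known in closed form, F(z) = exp(z) − exp(a), which gives an independent check of the quadrature.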

This is all formal calculation. To justify it, the theorem on differentiation
under the ∫ sign (Chap. V, n° 9, Theorem 9) needs to be applied. Leaving
aside subtleties unnecessary for the time being, this supposes that the
function f[H(z,t)] DH(z,t) integrated in (14) has continuous functions of the
couple (z,t) ∈ G × I as partial derivatives with respect to x and y. Since f
does not present any problems, the derivatives of H(z,t) and DH(z,t) with
respect to x and y must, therefore, exist and be continuous on G × I. The
formula DD_1 = D_1D has also been used; this is justified if H is of class C^2
on¹³ G × I, in which case the previous conditions are obviously satisfied.

Calculation (15) and the relation F′ = f are, therefore, justified, provided
there is a map

H : G × I → G


¹³ This is problematic since functions of class C^p have only been defined on an
open Cartesian space; however, I is compact and G is open, so that the product
G × I ⊂ C × R = R^3, a vertical cylinder having G as base and height 1, is neither
open nor closed in R^3. The solution is to constrain H to be C^2 on the open set
G × ]0,1[ and H and its derivatives of at most second order to be the restrictions
of functions defined and continuous on G × I to this set. Then derivatives at
points of the form (z,0) or (z,1) are well-defined and the relation D_1D = DD_1,
which holds at (z,t) for 0 < t < 1, by passing to the limit, also holds for t = 0
or 1. It would be simpler to assume that H is defined and of class C^2 on G × J,
where J is an open interval containing I. In practice, this does not change the
results in any way.
satisfying the following conditions:

(i) H(z,0) = a, H(z,1) = z for all z ∈ G,
(ii) H is of class C^2

in the sense specified in the previous footnote. It is the case of the map
(z,t) ↦ tz for a star domain about the origin.

The existence of a continuous, but not necessarily C^2, map H from G × I
to G satisfying (i) for some point a ∈ G is expressed by saying that the
domain G is contractible onto a. Setting H_t(z) = H(z, 1 − t), we then get a
one-parameter family of continuous maps H_t indexed by t from G into itself
starting with the identity map z ↦ z and, at the end of the process, taking G
onto the point a; under the “contraction”, each z ∈ G describes a trajectory
t ↦ H(z, 1 − t) which takes it from its initial position to the point a. It can
be shown that if there is a contraction of class C^0 from G onto a point, then
there also is one that is C^2 and even C^∞; while not being easy, it is not
very difficult to prove. Hence, if we admit this point,¹⁴ we get a more general
result than that of n° 1 regarding star domains, but it will in its turn be
generalized (?) further down:


Theorem 2. Every holomorphic function defined on a contractible domain
G ⊂ C has a primitive on G.


Corollary. An annulus r < |z| < R is not contractible (which is physically
obvious), since the function 1/z does not have primitives: its integral along
a circle centered at 0 is obtained by integrating 2πi over [0,1], and so is
equal to 2πi despite the closure of the integration path. However, C − R_− is
contractible and even a star domain (consider the homotheties with centre 1),
which explains why the function 1/z has a primitive on the domain, namely
any uniform branch of the pseudo-function Log z.
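
Both halves of the corollary lend themselves to a quick numerical experiment. In the Python sketch below (an illustration only; `path_integral`, `circle` and `log_branch` are names introduced here, and the quadrature is a plain midpoint rule), the integral of 1/z along the unit circle is close to 2πi, while on the slit plane the integral along segments issued from the centre 1 reproduces the principal logarithm.

```python
import cmath

def path_integral(f, mu, n=20000):
    """Midpoint-rule approximation of the integral of f along mu: [0, 1] -> C."""
    total = 0j
    for k in range(n):
        t0, t1 = k / n, (k + 1) / n
        total += f(mu((t0 + t1) / 2)) * (mu(t1) - mu(t0))
    return total

def circle(center, radius):
    """Closed path t -> center + radius * exp(2 pi i t)."""
    return lambda t: center + radius * cmath.exp(2j * cmath.pi * t)

def log_branch(z, n=20000):
    """Primitive of 1/z on C - R_-: integral of 1/zeta along the segment [1, z]."""
    return path_integral(lambda w: 1 / w, lambda t: 1 + t * (z - 1), n)

inverse = lambda z: 1 / z
in_annulus = path_integral(inverse, circle(0, 1))      # path winding around the hole
in_slit_plane = path_integral(inverse, circle(2, 1))   # closed path inside C - R_-
```

The sample points used to compare `log_branch` with `cmath.log` should stay away from the negative real axis, where the segment from 1 would leave the domain.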


3 — Homotopy Invariance of Integrals 


(i) Homotopic paths. Computation (2.15) to differentiate under the ∫ sign would
continue to hold if the function H(x,y,t) was replaced by a function of
multiple real variables with values in G. The simplest case is that of a C^2 map,¹⁵


¹⁴ It is in fact unnecessary as Theorem 2 is a consequence of Theorem 3 which will
be proved later. The relevance of Theorem 2 as stated here lies only in its proof
and, as such, is only a calculus exercise.

¹⁵ For reasons stated in chapter 9, namely similarities between curvilinear
integrals (dimension 1) and surface integrals (dimension 2), σ may be called a
2-dimensional path in C; the reader will easily generalize to all dimensions.
There is no orthodox terminology; some, like Serge Lang, talk wrongly of a
2-dimensional simplex as in algebraic topology. Ours suggests that such a “path”
takes us in a continuous manner from the usual path μ_0 : t ↦ σ(0,t) to another
one, μ_1 : t ↦ σ(1,t), just like a usual one-dimensional path takes us in a
continuous manner from a point, a 0-dimensional path, to another one.


σ : I × I → G


i.e. satisfying the following conditions:


(a) σ is of class C^2 on the open interior of I × I ⊂ R^2;
(b) the partial derivatives of order ≤ 2 of σ can be extended by continuity¹⁶
to I × I.


Such a map defines two families of C^2 paths in G, namely

(3.1)    μ_s : t ↦ σ(s,t)

and

(3.2)    ν_t : s ↦ σ(s,t).

As σ is continuous, the family of paths μ_s may be regarded as a “deformation”
of μ_0 to μ_1. The fact that a path can be deformed into another one in this
way, or even by a merely continuous function σ from I × I to G, is expressed
by saying that the two paths considered are homotopic. A first useful case is
that of a fixed-endpoint homotopy of μ_0 to μ_1: we then suppose that

μ_s(0) = σ(s,0) and μ_s(1) = σ(s,1)

are independent of s. Another case occurs when, μ_0 and μ_1 being closed, the
intermediate paths μ_s stay closed during the deformation:

σ(s,0) = σ(s,1) for all s;

μ_0 is said to be homotopic through closed paths to μ_1.

Apart from these cases, the homotopy condition is always fulfilled (hence
uninteresting) because, on the one hand, every path μ is homotopic to a
“constant” path by σ(s,t) = μ[(1 − s)t], and because, on the other, two
“constant” paths are always homotopic as can be seen by connecting the
former to the latter by a path in G and by shifting the former along it to
take it onto the latter.


Despite being somewhat abstract, fixed-endpoint homotopy can be interpreted
in an interesting way. Remark first that the set C^0(I) of all continuous
paths I → C,¹⁷ equipped with the norm ‖μ‖_I = sup |μ(t)| and the
obvious algebraic operations (addition, multiplication by a complex number),
is a complete normed vector space (Cauchy’s criterion for uniform
convergence), i.e. a Banach space (Chap. III, Appendix, n° 5). Paths can, therefore,
be defined in C^0(I) as in any topological space: they are continuous maps
h : [0,1] = I → C^0(I). Hence, for any s ∈ I, h(s) = μ_s is a path in C, and
setting σ(s,t) = μ_s(t), we get a map σ from I × I to C; t ↦ σ(s,t) = μ_s(t) is
clearly continuous for all s.


¹⁶ As I × I is compact, this means precisely that they are uniformly continuous on
the open set ]0,1[ × ]0,1[: Chap. V, § 2, n° 2, Corollary 2 of Theorem 8.

¹⁷ Up to vocabulary, a continuous “path” is just a complex valued function defined
and continuous on I.
Having said this, let us show that the map h : I → C^0(I) is continuous if
and only if so is the map σ : I × I → C. If indeed the latter is continuous, it
is so uniformly since I × I is compact (Chap. V, § 2, n° 8); in particular, this
means that for all r > 0, there exists r′ > 0 such that

|s − s′| < r′ ⟹ |σ(s,t) − σ(s′,t)| < r for all t ∈ I;

but since h(s) ∈ C^0(I) is just the path t ↦ σ(s,t), this relation can be written

(3.3)    |s − s′| < r′ ⟹ ‖h(s) − h(s′)‖_I ≤ r.

Hence h is continuous. Proving the converse amounts to showing that (3)
implies continuity of σ(s,t) at all points (s,t) ∈ I × I. To see this, let us start
with the inequality

|σ(s′,t′) − σ(s,t)| ≤ |σ(s′,t′) − σ(s,t′)| + |σ(s,t′) − σ(s,t)|

and choose some r > 0. If |s − s′| < r′, by (3), the first term on the right hand
side is ≤ r for all t′; but as the function t ↦ σ(s,t) is continuous for given s,
for given (s,t), the second term on the right hand side is < r if |t − t′| is
sufficiently small, qed.

Let us now consider an open subset G of C and let C^0(I,G) be the subset of
C^0(I) consisting of paths I → G; it is open in C^0(I) since if μ ∈ C^0(I,G),
the image μ(I) is a compact subset of G whose distance R to the border of G
is strictly positive;¹⁸ it is then obvious that any path ν : I → C such that
‖μ − ν‖_I < R remains a path in G that is in fact homotopic to μ as the line
segment

σ(s,t) = (1 − s)μ(t) + sν(t)

connects μ and ν in C^0(I). On the other hand, it is obvious that for given
a, b ∈ G, the set C^0_{a,b}(G) of continuous paths I → G connecting a to b in G
is a closed subset of the open set C^0(I,G). The same holds for the set of closed
paths in G.

In conclusion, two paths with given endpoints a and b in G are fixed-endpoint
homotopic if and only if they can be connected by a continuous path
in the space C^0_{a,b}(G) of all such paths. A similar result holds for a homotopy
of closed paths.


(ii) Differentiation with respect to a path. A norm can be defined in the
vector space C^{1/2}(I) of admissible paths I → C by setting

‖μ‖ = ‖μ‖_I + ‖μ′‖_I;

equipped with it, C^{1/2}(I) is complete. Indeed, if (μ_n) is a Cauchy sequence,
the functions μ_n(t) and μ′_n(t) converge uniformly to some limits μ and ν that
are respectively continuous and regulated; the relation

μ(t) − μ(0) = ∫_0^t ν(u) du

proving that μ is a primitive for ν is then obtained by passing to the limit.
Besides, it is obvious that, as in the space C^0(I), the set C^{1/2}(I;G) of μ ∈
C^{1/2}(I) such that μ(I) ⊂ G is open in C^{1/2}(I).

¹⁸ Let F be this border; the function d(z,F) is continuous on the compact set μ(I),
and so reaches its minimum at some point a ∈ μ(I); were this minimum zero,
there would be a sequence of points of F converging to a. This would imply that
a ∈ F since F is closed, a contradiction.

Let us now return to formula (2.13)

(3.4)    F(μ) = ∫_μ f(ζ)dζ = ∫ f[μ(t)] μ′(t) dt

defining a function on C^{1/2}(I;G). It may seem strange to differentiate it with
respect to μ, but as it is defined on the open subset C^{1/2}(I;G) of the Banach
space C^{1/2}(I), definition (1.1) which holds in R^p can be imitated. Here, c
and h will be replaced by some μ ∈ C^{1/2}(I;G) and some ν ∈ C^{1/2}(I). So the
expression

F(μ + sν) = ∫ f[μ(t) + sν(t)] [μ′(t) + sν′(t)] dt =
          = ∫ f[μ(t) + sν(t)] dμ(t) + s ∫ f[μ(t) + sν(t)] dν(t)

must be differentiated with respect to s. For any ν, it is well-defined for
sufficiently small |s|. To differentiate under the ∫ sign [Chapter V, § 2,
Theorem 9 or § 9, formula (30.15)], it is sufficient to check that f[μ(t) + sν(t)]
is a continuous function of (s,t), which is obvious, and that its derivative
with respect to s is a continuous function of (s,t); its existence is obvious (it
is f′[μ(t) + sν(t)] ν(t)) and so is its continuity since the functions f′, μ and
ν are continuous. Hence in telegraphic style,

d/ds F(μ + sν) = ∫ f′(μ + sν)ν dμ + ∫ f(μ + sν) dν + s ∫ f′(μ + sν)ν dν =
               = ∫ f′(μ + sν)ν (dμ + s dν) + ∫ f(μ + sν)ν′ dt.

Integration by parts can justifiably be used to compute the last integral since
the functions f(μ + sν) and ν are of class C^{1/2}. So it can also be written

f(μ + sν)ν |_{t=0}^{t=1} − ∫ f′(μ + sν)(μ′ + sν′)ν dt.

Since [μ′(t) + sν′(t)] dt = dμ(t) + s dν(t), the last integral can be written
∫ f′(μ + sν)ν(dμ + s dν), canceling out the first term of the penultimate
formula. Hence finally,

(3.5)    d/ds F(μ + sν) = f[μ(1) + sν(1)] ν(1) − f[μ(0) + sν(0)] ν(0).
All this obviously supposes that only values of s such that μ + sν ∈ C^{1/2}(I;G)
are used. The reader will probably be under the impression of having come
across similar calculations above. A more or less vague form of this idea is
due to Cauchy: in his work on the series expansion of f[μ(t) + sν(t)] with
respect to s, he calculated the coefficient of s. His calculations have long since
disappeared from textbooks. This is a good reason for reintroducing them by
rectifying his simplistic, but ultimately correct idea, since it reappears in a
very generalized form in the version of the calculus of variations for example
found in H. Cartan, Differential Calculus, where a function of the following
form is differentiated with respect to μ:

F(μ) = ∫_a^b f[t, μ(t), μ′(t)] dt.

On the other hand, note that in the presence of a formula such as (5), the
idea of deducing an expression for F(μ + sν) by applying the FT immediately
arises. This will be done a bit later.
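
Formula (3.5) for the derivative of s ↦ F(μ + sν) can be tested numerically. The following Python sketch is a rough check under assumptions made here: the sample function f, the path μ, the direction ν, the midpoint rule and the central difference in s are all illustrative choices.

```python
import cmath

def F(f, path, dpath, n=20000):
    """F(mu) = integral over [0, 1] of f[mu(t)] mu'(t) dt, by the midpoint rule."""
    h = 1.0 / n
    return sum(f(path((k + 0.5) * h)) * dpath((k + 0.5) * h) for k in range(n)) * h

f = lambda z: z ** 3 + cmath.exp(z)

# Sample path mu and direction nu; nu does not vanish at the endpoints.
mu = lambda t: cmath.exp(1j * cmath.pi * t)
dmu = lambda t: 1j * cmath.pi * cmath.exp(1j * cmath.pi * t)
nu = lambda t: 1 + t * t + 1j * t
dnu = lambda t: 2 * t + 1j

def F_shifted(s):
    """The value F(mu + s nu)."""
    return F(f, lambda t: mu(t) + s * nu(t), lambda t: dmu(t) + s * dnu(t))

def lhs(s, eps=1e-5):
    """Central difference approximating the derivative of F(mu + s nu) in s."""
    return (F_shifted(s + eps) - F_shifted(s - eps)) / (2 * eps)

def rhs(s):
    """Right hand side of (3.5): only the endpoints of the path contribute."""
    return f(mu(1) + s * nu(1)) * nu(1) - f(mu(0) + s * nu(0)) * nu(0)
```

Comparing `lhs(s)` and `rhs(s)` for a few small values of s shows agreement up to discretization error.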

Exercise 1. Let μ_0 and μ_1 be two paths in G and σ : I × I → G a
homotopy from μ_0 to μ_1; write F(s) for the integral of the function f(z)
along the path μ_s. Assuming σ to be of class C^2, find a formula similar to (5)
for F′(s).


(iii) Effects of a linear homotopy on an integral. We can now return to
the behaviour of an integral when an integration path μ_0 is deformed into
a path μ_1 without leaving the domain G where the function f to be
integrated is defined and holomorphic. The difficulty is that, for 0 < s < 1, the
intermediary paths μ_s are continuous, but not necessarily admissible. We can
get around it by altering the homotopy so that the μ_s become admissible or,
equivalently, by showing that it is possible to go from μ_0 to μ_1 by a succession
of linear homotopies between admissible paths, i.e. of the form

(3.6)    σ(s,t) = (1 − s)μ_0(t) + sμ_1(t) = μ_s(t),

where s, t ∈ I = [0,1]. There is always such a homotopy when μ_1 is sufficiently
near μ_0 in the sense of uniform convergence: if R > 0 is the distance from
μ_0(I) to the border of G, then σ(s,t) ∈ G for all s, t ∈ I provided
‖μ_1 − μ_0‖_I < R.

However, path (6) is of the form μ_0 + sν with

ν(t) = μ_1(t) − μ_0(t).

It is, therefore, possible to apply (5) to the function F(μ_s), whence

(3.7)    d/ds ∫_{μ_s} f(ζ)dζ = f[μ_s(t)] [μ_1(t) − μ_0(t)] |_{t=0}^{t=1}.

Suppose first that it is a fixed-endpoint homotopy. Then, μ_1(t) − μ_0(t) = 0
for t = 0 or 1, the derivative is zero for all s and the integral is, therefore,
independent of s ∈ [0,1]. In other words, in this case

(3.8)    ∫_{μ_0} f(ζ)dζ = ∫_{μ_1} f(ζ)dζ.

Suppose now that all paths μ_s are closed, i.e. that μ_s(1) = μ_s(0) for all s. A
short calculation shows that the right hand side of (7) is still zero for all s.
Hence (8) follows again.
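
Relation (3.8) can be observed on a concrete fixed-endpoint linear homotopy. In the Python sketch below, which is only an illustration, the domain G = C − {0}, the function 1/z and the two paths from 1 to −1 (a half-circle and a half-ellipse in the upper half-plane) are assumptions chosen so that the intermediate paths of (3.6) never meet 0.

```python
import cmath
import math

def path_integral(f, mu, n=20000):
    """Midpoint-rule approximation of the integral of f along mu: [0, 1] -> C."""
    total = 0j
    for k in range(n):
        t0, t1 = k / n, (k + 1) / n
        total += f(mu((t0 + t1) / 2)) * (mu(t1) - mu(t0))
    return total

f = lambda z: 1 / z   # holomorphic on G = C - {0}

# Two admissible paths from 1 to -1 in the upper half-plane.
mu0 = lambda t: cmath.exp(1j * math.pi * t)
mu1 = lambda t: complex(math.cos(math.pi * t), 2 * math.sin(math.pi * t))

def mu_s(s):
    """Intermediate path of the linear homotopy (3.6)."""
    return lambda t: (1 - s) * mu0(t) + s * mu1(t)

# Integral of f along mu_s for s = 0, 1/4, 1/2, 3/4, 1.
values = [path_integral(f, mu_s(k / 4)) for k in range(5)]

# A path with the same endpoints that is not homotopic to mu0 in G.
lower = lambda t: cmath.exp(-1j * math.pi * t)
other = path_integral(f, lower)
```

All five values are close to iπ, whereas the lower half-circle, which cannot be deformed into μ_0 without crossing 0, gives a value close to −iπ.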

Without using these assumptions, the FT applied to relation (7) shows
that, by integrating over [0,1],

(3.9)    ∫_{μ_1} f(ζ)dζ − ∫_{μ_0} f(ζ)dζ = ∫ f[μ_s(1)] [μ_1(1) − μ_0(1)] ds −
             − ∫ f[μ_s(0)] [μ_1(0) − μ_0(0)] ds.


Fig. 3.2. 


As μ_1(1) − μ_0(1) is the derivative of μ_s(1) with respect to s, the first term is
just the integral of f along the path s ↦ μ_s(1), the second being the integral
of f along the path s ↦ μ_s(0). Therefore, the relation obtained means that
the integral of f along the coherently oriented closed path γ drawn in figure
2 above is zero. This would be obvious if f had a primitive on G, which is,
however, not assumed. We will not generalize to any closed path: as a closed
path, γ is homotopic to the path consisting in traveling μ_0 twice in opposite
directions, and so is homotopic to a point. As for the fact that the integral
of f along γ is zero, it follows from Theorem 3 which will be proved shortly.


(iv) The homotopy invariance theorem. Paths μ_0 and μ_1 always being
admissible, suppose only that the homotopy σ deforming μ_0 into μ_1 is C^0; it
is no longer possible to differentiate integrals or even to write them. But a
given homotopy can be approached by linear homotopies and the preceding
point can be used, which, as will be seen, again leads to the same results.

The image σ(I × I) ⊂ G being compact, as already stated, its distance R
from the border of G is > 0. Choose r with 0 < r < R. As σ is uniformly
continuous on the compact set I × I, there exists r′ > 0 such that

(3.10)    |s − s′| < r′ & |t − t′| < r′ ⟹ |σ(s,t) − σ(s′,t′)| < r.

For t = t′, in particular this shows that

(3.11)    |s − s′| < r′ ⟹ ‖μ_s − μ_{s′}‖_I ≤ r;

see remarks at the end of (i).

Having said that, let us choose an integer n, give s values of the form
s_p = p/n with 0 < p < n and let ν_p be the path μ_s for s = p/n. It may
not be admissible, but it can be approached by a piecewise linear, and hence
admissible, path γ_p by choosing for its successive vertices the points of ν_p
indexed by the parameter t_q = q/n, i.e. the points a_{pq} = σ(p/n, q/n).


Fig. 3.3. 


If n is sufficiently large, the diameter of the square K_{pq} with vertices
(s_p, t_q), (s_{p+1}, t_q), (s_{p+1}, t_{q+1}), (s_p, t_{q+1}) in I × I is < r′.
Therefore, by (10), its image under σ is contained in the disc D_{pq} centered at
a_{pq} and of radius r. However, the latter is convex and contained in G since
r < R. As a result:


(a) the line segment connecting a_{pq} to a_{p,q+1} is contained in G for all q,
so that the same holds for the path γ_p obtained by juxtaposing these
segments for different values of q;

(b) for any t ∈ [t_q, t_{q+1}], the line segment connecting σ(s_p,t) to σ(s_{p+1},t)
is contained in G since its endpoints are in D_{pq}; the same holds for the
segment connecting γ_p(t) to γ_{p+1}(t), since both points lie in the convex
disc D_{pq}. Hence, there is a linear deformation that takes us from γ_p to
γ_{p+1} without leaving G.


The same arguments show that there are linear deformations taking us
from μ_0 to γ_1 and from γ_{n−1} to μ_1.




Hence, if the definition of the γ_p given for 0 < p < n is completed by
setting γ_0 = μ_0 and γ_n = μ_1, we get a sequence of admissible (and even
piecewise linear except for the first and the last one) paths in G

γ_0 = μ_0, γ_1, ..., γ_n = μ_1

such that there are linear deformations taking us from each of them to the
next one without leaving G. If the deformation σ we started with is a
fixed-endpoint homotopy, the intermediate paths γ_p clearly also have the same
endpoints as the two given paths. As seen in (iii), a fixed-endpoint linear
homotopy leaves the integral invariant. So the integrals along γ_p and γ_{p+1}
are equal for all p. Similarly, if μ_0 and μ_1 are closed and remain so during the
given deformation σ, γ_p is closed for all p and remains so during the linear
deformation taking it onto γ_{p+1}. Hence, once again the integrals are equal.
In conclusion:


Theorem 3. Let G be a domain in C, f a holomorphic function on G and
μ_0, μ_1 two admissible paths in G. If one of the two following conditions is
satisfied, then the integrals of f along μ_0 and μ_1 are equal:


(a) there is a fixed-endpoint homotopy in G joining μ_0 and μ_1;
(b) μ_0 and μ_1 are closed and homotopic in G as closed paths.
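
The polygonal paths γ_p used in the proof are easy to build explicitly. The Python sketch below does so for a sample homotopy of closed paths in G = C − {0}; the map `sigma`, the grid size and the function 1/z are assumptions made for the illustration. For 1/z the integral along a segment [a, b] lying in a disc that avoids 0 is the principal value Log(b/a), so no quadrature is needed here.

```python
import cmath
import math

def sigma(s, t):
    """A continuous homotopy of closed paths in C - {0} (illustrative choice)."""
    radius = 1 + 0.5 * s * math.cos(4 * math.pi * t)   # stays in [0.5, 1.5]
    return radius * cmath.exp(2j * math.pi * t)

def vertices(p, n):
    """Vertices a_pq = sigma(p/n, q/n), q = 0..n, of the polygonal path gamma_p."""
    return [sigma(p / n, q / n) for q in range(n + 1)]

def integral_of_inverse(polygon):
    """Integral of 1/z along a polygonal path whose segments avoid 0.

    Each segment [a, b] sweeps an angle smaller than pi, so its contribution
    is the principal logarithm of b / a.
    """
    return sum(cmath.log(b / a) for a, b in zip(polygon, polygon[1:]))

n = 32
integrals = [integral_of_inverse(vertices(p, n)) for p in range(n + 1)]
```

Every entry of `integrals` is close to 2πi: the integral does not change from one polygonal path to the next, as the theorem predicts for closed paths homotopic in G.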


Exercise 2 (direct proof of Theorem 3). We keep the above construction
and notation by, for example, supposing that σ is a fixed-endpoint homotopy;
proving this result amounts to showing that integrals along γ_p and γ_{p+1}
are equal for all p. For this, take a closed path γ_{pq} made of line segments
connecting a_{pq}, a_{p,q+1}, a_{p+1,q+1}, a_{p+1,q} and a_{pq} in the given order;
using the existence of a primitive for f on D_{pq}, show that the integral of f
along this path is zero. Show that the difference between integrals of f along
γ_p and γ_{p+1} is equal to the sum, extended to q, of integrals along γ_{pq} and
conclude.¹⁹

Theorem 3 provides a new existence theorem for primitives. For this it
suffices to suppose that condition (b) is satisfied for all μ_0 and μ_1; in this
case, any closed path μ is indeed homotopic as a closed path to a “constant”
path t ↦ a, where a ∈ G is arbitrarily chosen, so that the integral of f along
μ is zero. Then Theorem 1, or its equivalent in terms of closed paths, shows
that f has a primitive on G.

In the next chapter it will be shown that conditions (a) and (b) are
equivalent in a more general framework. Domains in which they are satisfied for
all continuous closed paths μ_0 and μ_1 are called simply connected.


¹⁹ For similar proofs, see Dieudonné, Eléments d’analyse, vol. 1, (9.6.3) or Remmert,
Funktionentheorie 2, Chap. 8, § 1, n° 5 and 6.


Corollary 1. Any holomorphic function on a simply connected domain G
of C has a global primitive on G.

Corollary 2. Let f be a holomorphic function on a simply connected
domain G; suppose that f does not vanish on G. Then, there is a holomorphic
function g on G such that e^{g(z)} = f(z) for all z ∈ G; it is unique up to
addition of a multiple of 2πi.


Since f does not vanish on G, the function f′/f is defined and holomorphic
on G, and hence admits a primitive g; then (e^g)′ = g′e^g = e^g f′/f, i.e.
(e^g)′f − e^g f′ = 0, hence (e^g/f)′ = 0, so that the function e^g is proportional
to f. Adding a constant to g, e^g = f may be assumed. Any other holomorphic
solution g_1 must satisfy the condition g_1(z) − g(z) ∈ 2πiZ for all z, which
obviously requires the left hand side to be a constant, qed.

The relation e^g = f means that g(z) ∈ Log f(z) holds for all z ∈ G, where
Log w denotes the set of z ∈ C such that exp(z) = w (Chapter IV, § 4). Such a
function g is called a uniform branch of the pseudo-function Log f(z). Then

g(z) = log |f(z)| + i.Arg f(z),

where, at all points, the argument of f(z) must be chosen so that it is a
continuous function of z.
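
The proof of Corollary 2 is constructive, and a small Python sketch can follow it step by step. Everything specific below is an assumption made for the illustration: the domain is the unit disc, f(z) = (z + 2)^8 does not vanish there, and the primitive of f′/f is computed along the segment [0, z] by the midpoint rule.

```python
import cmath

def f(z):
    return (z + 2) ** 8

def df(z):
    return 8 * (z + 2) ** 7

def g(z, n=20000):
    """Primitive of f'/f on the disc |z| < 1, normalised so that exp(g(0)) = f(0)."""
    h = 1.0 / n
    integral = sum(df(z * (k + 0.5) * h) / f(z * (k + 0.5) * h) for k in range(n)) * h * z
    return cmath.log(f(0)) + integral

z0 = 0.9j
branch = g(z0)                  # value of the uniform branch of Log f at z0
principal = cmath.log(f(z0))    # principal determination, for comparison
```

At this point the argument of f(z0) exceeds π, so the continuous branch `branch` and the principal value `principal` differ by 2πi, although both have exponential f(z0).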

The definition of uniform branches of the no less pseudo-functions f(z)^s,
where s ∈ C is given and is not an integer, follows from such branches g: these
are the functions e^{s.g(z)}. For s = 1/p with p integer, we thus get holomorphic
solutions of the equation h(z)^p = f(z); they can be deduced from any one of
them by taking its product with a p-th root of unity.

All this supposes that G is simply connected. The case of the function
f(z) = 1/z on G = C − {0} shows that this assumption is essential. In fact,
Corollary 1 can be shown to characterize simply connected domains, but this
result is rarely used.


Corollary 3. Let f be a holomorphic function on a domain G. The integral
of f along any null-homotopic closed path, i.e. homotopic to a point in G, is
zero.


This result explains Theorem 2 whose proof used a homotopy of class C^2,
an assumption now unnecessary.

Any contractible domain G is simply connected. Indeed, if H is a contraction
of G to a point a ∈ G and μ a closed path in G, then μ is homotopic through
closed paths to the constant path t ↦ a under the map (s,t) ↦ H[μ(t), 1 − s].


This trivial result has a far less trivial converse: any simply connected
domain G in C is not only contractible, but also homeomorphic to the unit
disc |z| < 1; except when G = C, a case excluded by Liouville’s theorem
on entire functions, there is even a holomorphic function on G mapping G
bijectively onto the open disc D : |z| < 1 and whose inverse is holomorphic
(Riemann). Any holomorphic bijection f : U → V from an open set onto
another whose inverse g : V → U is holomorphic is called a conformal
representation of U on V. The existence of such a representation means that U
and V are “isomorphic” from the point of view of analytic function theory:
everything that holds for holomorphic or harmonic functions on U can be
immediately transferred to holomorphic or harmonic functions on V. As
relation g[f(z)] = z shows that the derivatives of f and g are mutually inverse
at corresponding points, it follows that f′(z) ≠ 0 for all z ∈ U. Conversely, if
this condition is satisfied, then f, though not necessarily a global
homeomorphism (for this f would need to be injective), transforms every open subset
of U, and in particular U itself, into an open subset of C. This was proved
in Chapter III, § 5, n° 24 using the local inversion theorem: setting f = p + iq,
the map taking the point (x,y) ∈ R^2 to the point (ξ,η) such that
ξ + iη = f(x + iy) can also be written

ξ = p(x,y),  η = q(x,y),

so that, by Cauchy’s equations, its Jacobian

J_f(x,y) = D_1p(x,y)D_2q(x,y) − D_2p(x,y)D_1q(x,y)

is equal to

D_1p(x,y)^2 + D_1q(x,y)^2 = |f′(z)|^2


and hence is non-zero. This proves the result. Moreover, if f : U → V =
f(U) is assumed to be injective, then f is a homeomorphism [since the inverse
image of an open set U′ ⊂ U under f^{-1} is then f(U′), and so is open] and
the local inversion theorem shows that, like f, the inverse map g : V → U
is of class C^1 as a function of two real variables. It is holomorphic since the
relation g[f(z)] = z implies that the Jacobian matrix of g at ζ = f(z) is the
inverse of that of f at z. Now, holomorphic functions are characterized by
the fact that, at all points, their Jacobian matrix is of the form

( a  −b )
( b   a )

It is, therefore, sufficient to check that the inverse of such a matrix is also of
the same type. More simply: the inverse of any C-linear map is C-linear. All
this has been proved in Chapter III, § 5, but it is worth recalling here. We will
see in n° 5 (Theorem 7) that the relation f′(z) ≠ 0 is in fact a consequence
of the injectivity of f, in other words that conformal representations are just
bijective holomorphic maps.
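
The shape of the Jacobian matrix and the value |f′(z)|^2 of its determinant can be checked by finite differences. The Python sketch below is an illustration only; the sample function, the evaluation point and the step `eps` are arbitrary assumptions.

```python
import cmath

def f(z):
    return z ** 3 + cmath.exp(z)

def df(z):
    return 3 * z ** 2 + cmath.exp(z)

def jacobian(func, x, y, eps=1e-6):
    """Central-difference Jacobian matrix of (x, y) -> (p, q), where func = p + iq."""
    dx = (func(complex(x + eps, y)) - func(complex(x - eps, y))) / (2 * eps)
    dy = (func(complex(x, y + eps)) - func(complex(x, y - eps))) / (2 * eps)
    # Rows: (D_1 p, D_2 p) and (D_1 q, D_2 q).
    return [[dx.real, dy.real], [dx.imag, dy.imag]]

x, y = 0.7, -0.4
J = jacobian(f, x, y)
det = J[0][0] * J[1][1] - J[0][1] * J[1][0]
```

Up to rounding, the diagonal entries of `J` coincide, the off-diagonal entries are opposite, and `det` equals `abs(df(complex(x, y))) ** 2`.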

The fact that a simply connected domain G other than C is isomorphic
to the unit disc is one of the most famous results of Riemann; while being
based on a method going far beyond the framework of holomorphic functions
(PDE specialists’ “Dirichlet’s principle”), his proof was not really
satisfactory. Simpler proofs have since then been found²⁰ and the behaviour of a
conformal representation f of G on the unit disc in the neighbourhood of the
boundary of G has been widely studied; if, for example, G is bounded and
if its boundary consists of a finite number of simple arcs of curve, or if G is
bounded and convex, then f can be extended to a homeomorphism from the
closure of G onto the closed disc |z| ≤ 1.

²⁰ See for instance Chap. 14 in Rudin and for examples Chap. X in Dieudonné,
Analyse infinitésimale.

If f is a conformal representation of G on the unit disc D, then any
conformal representation g of G on D is obviously of the form h ∘ f, where
h = g ∘ f^{-1} is a conformal representation of D on itself, and conversely. This
leads us to determine the conformal representations of D on D, which is
much easier than proving Riemann’s theorem: these are precisely the maps
given by


(3.11)    h(z) = (az + b)/(b̄z + ā) = ζ  where aā − bb̄ = 1.


Exercise 3. (i) Show that (11) is defined for |z| < 1. (ii) Show that ζζ̄ − 1 =
(zz̄ − 1)/|b̄z + ā|^2 and deduce that h(D) ⊂ D. (iii) Observing that h^{-1} is also
of the form given in (11), show that h(D) = D. (iv) Let f be a conformal
representation of D on D such that f(0) = 0; show that f′(0) ≠ 0 and
f(z) = zg(z) where g is holomorphic and verify that |g(z)| < |z|^{-1} in D.
(vi) Using the maximum principle, show that |g(z)| ≤ 1/r for |z| ≤ r < 1 and
deduce that |g(z)| ≤ 1 in D (particular case of Schwarz’s lemma: Chap. VII,
§ 4, n° 15, cor. 3 of theorem 11). (vii) Show that if f(0) = 0, then |f(z)| ≤ |z|
and |f^{-1}(z)| ≤ |z|. Deduce that f(z) = az, where |a| = 1. (viii) Show that,
for any conformal representation f of D on D, there is a function (11) such
that h ∘ f has 0 as fixed point. Deduce that f is of the form given in (11).
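
Before attempting the exercise, the statements of (ii) and (iii) can be explored numerically. The Python sketch below is not a proof; the parametrisation a = cosh(u)e^{iα}, b = sinh(u)e^{iβ}, the numerical values and the sample points are assumptions chosen so that aā − bb̄ = 1.

```python
import cmath
import math

def disc_map(a, b):
    """h(z) = (a z + b) / (conj(b) z + conj(a)), assuming |a|^2 - |b|^2 = 1."""
    return lambda z: (a * z + b) / (b.conjugate() * z + a.conjugate())

u = 0.8
a = math.cosh(u) * cmath.exp(0.3j)
b = math.sinh(u) * cmath.exp(-1.1j)

h = disc_map(a, b)
# The inverse has the same form, with a replaced by conj(a) and b by -b.
h_inverse = disc_map(a.conjugate(), -b)

# Sample points of the open unit disc.
samples = [r * cmath.exp(1j * k) for r in (0.0, 0.5, 0.99) for k in range(6)]

def identity_gap(z):
    """Difference between the two sides of the identity in (ii)."""
    zeta = h(z)
    rhs = (abs(z) ** 2 - 1) / abs(b.conjugate() * z + a.conjugate()) ** 2
    return abs((abs(zeta) ** 2 - 1) - rhs)
```

For every sample point, `abs(h(z))` stays below 1, `identity_gap(z)` is at rounding level, and `h_inverse(h(z))` returns z.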

Exercise 4. Let P be the half-plane Im(z) > 0. (i) Show that the map
z ↦ (z − i)/(z + i) is a conformal representation of P on D. (ii) Deduce that
the conformal representations of P on P are the maps


(3.12)    z ↦ (az + b)/(cz + d) with a, b, c, d ∈ R, ad − bc = 1.
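
As with the previous exercise, a short Python experiment can make the claims plausible before they are proved. The coefficients 2, 3, 1, 2 and the sample points are assumptions made for this sketch; the relation Im w = Im z / |cz + d|^2 used in the last function is the standard identity for real coefficients with ad − bc = 1.

```python
def cayley(z):
    """The map z -> (z - i) / (z + i) of part (i)."""
    return (z - 1j) / (z + 1j)

def real_moebius(a, b, c, d):
    """z -> (a z + b) / (c z + d) with real coefficients, assuming a d - b c = 1."""
    return lambda z: (a * z + b) / (c * z + d)

a, b, c, d = 2.0, 3.0, 1.0, 2.0      # a d - b c = 4 - 3 = 1
w = real_moebius(a, b, c, d)

# Sample points of the half-plane Im(z) > 0.
points = [complex(x, y) for x in (-3, 0, 2.5) for y in (0.01, 1, 10)]

def imaginary_part_gap(z):
    """Difference between Im w(z) and Im z / |c z + d|^2."""
    return abs(w(z).imag - z.imag / abs(c * z + d) ** 2)
```

On the sample points `cayley` lands inside the unit disc and `w` keeps the imaginary part positive, in agreement with (i) and (ii).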


For Riemann, a simply connected domain was a domain partitioned into
two disjoint ones by all “cuttings” (simple curve segments connecting two
boundary points). For example this is clearly not the case of an annulus. The
equivalence of these two definitions is intuitively obvious, but proving it is
another matter...

Another “obvious” idea can be justified, namely that a domain is simply
connected if its complement does not have any compact connected
component, in other words if there are no “holes” in G. It is even possible to go
much further²¹ and consider domains whose complements have a finite
number of compact connected components K_i (1 ≤ i ≤ n). A first result that is
also made “obvious” by figure 4 is that there are then closed paths μ_i in G
such that K_i is in the “interior” of μ_i and K_j in the “exterior” of μ_i for all
j ≠ i; further explanations for the meaning of these terms will be given a bit
later (n° 4, (i)).


21 See chapters 8 and 14 in Remmert, Funktionentheorie 2, in particular the his- 
torical statements about Riemann’s theorem in chapter 8, and especially vol. 2 
of Conway, where everything is proved. 
Fig. 3.4.


A second far less “obvious” result, but which trivially implies the first,
is that a bounded domain with n holes has a conformal representation on
the domain obtained by removing from an open disc (which could be all of C)
n suitably chosen pairwise disjoint compact discs, each possibly reduced to
a point. This generalization of Riemann’s theorem, proved at the start of
the century by Paul Koebe, presents enough difficulties that even Remmert
only mentions it at the end of about 700 pages of general theorems on an- 
alytic functions. The number of holes can also be shown to be the same for 
two homeomorphic domains, but this is a very particular case of much more 
general theorems in algebraic topology. In fact, this entire subject is charac- 
terized by an amalgamation of methods from analytic function theory and 
topology that are sometimes difficult to separate out. Their generalization to 
functions of several complex variables gave rise to remarkable Franco-German 
discoveries after the war. In some sense, they are easier to understand than 
those from theory in one variable: as they are more general, they do not use 
“elementary” ad hoc arguments that hide the real reasons for these phenom- 
ena. 
There are also close links between these theories and the problem of
approximating holomorphic functions on a given domain G by simple polynomial
functions or, when this is not possible, by rational functions without
poles in G. If for example G is simply connected, and only in this case, any
holomorphic function f on G is the limit of a sequence of polynomials in z
converging uniformly to f on every compact subset of G. This result and
more general ones are due to Carl Runge²² (1885).


22 In fact, Runge did not see that the approximation by rational functions led to 
the result in question in the case of a simply connected domain. On another note, 
Runge was interested in atomic spectra in the hope of finding simple formulas 
that would allow their frequencies to be computed, as had already been done by 
Balmer for the hydrogen atom. We now know that this amounts to calculating 
the corresponding eigenvalues of the Schrödinger operator. This problem is still
too hard, the helium atom, and all the more the following ones, continuing to
resist all exact solutions. After 1900, when Felix Klein created the first applied
mathematics team in Göttingen, one of his recruits would be Runge, the first
leading expert of numerical analysis. He also recruited Ludwig Prandtl, who
remained until 1945 the greatest German expert of aerodynamics, perhaps even
the greatest world expert, while Prandtl’s first brilliant student, the Hungarian
Theodor von Kármán who emigrated to CalTech at the end of the 1920s, would
play the same role in the USA until the end of the 1950s. See Paul A. Hanle, 
Bringing Aerodynamics to America (MIT Press, 1982). 
4 — Integral Formula for a Circle


(i) Integrals in 1/z. Integrals of functions $1/(z - a)$ occur everywhere in
the theory of holomorphic functions and knowing how to compute them is
important. The integration path $\mu : I \to G$ is obviously assumed not to pass
through a. Supposing a = 0 for simplicity’s sake, the definition of $\int dz/z$
reduces to the integral of $\mu'(t)/\mu(t)$ over the interval I. Like $\mu'$, this
function is regulated, so it admits a primitive L(t) and, by the FT, the expected
result will be the variation of L(t) between the endpoints of I. But if we set
$h(t) = \exp[L(t)]$, then $h'(t) = L'(t)h(t)$. As $L'(t) = \mu'(t)/\mu(t)$, the derivative
of the continuous function $h(t)/\mu(t)$ is seen to be identically zero outside
a countable set of values of t. Hence the function is constant. Adding an
adequate constant to L, it may, therefore, be assumed that


(4.1)    $\exp[L(t)] = \mu(t)$


for all $t \in I$. As L(t) is continuous, this means that, by definition, L(t) is a
uniform branch of the pseudo-function Log z along $\mu$ in the sense of Chap. IV,
§4, (vii) and (viii). Recall again that for us the notation Log z does not
describe a clearly determined complex number, but, on the contrary, the set
of $\zeta \in C$ such that $\exp(\zeta) = z$.

As an aside, recall also (Chap. VII, end of n° 16) that a uniform branch of
Log z on a domain $G \subset C^*$ is similarly a (genuine) holomorphic function L
(continuity would suffice) defined on G and satisfying $L(z) \in \mathrm{Log}\,z$, i.e.


(4.2)    $\exp[L(z)] = z$,


for all $z \in G$. Unlike what happens in the case of a path, such a branch
does not always exist, in particular if $G = C^*$. It does so if and only if for
any path $\mu$ in G, the variation of a uniform branch of Log z along $\mu$, or,
equivalently, of the argument of z, only depends on the endpoints of the path
considered. Verified by the means available in Chap. IV, §4, (ix), this result
is just theorem 1 of §1 applied to 1/z.

If, instead of integrating 1/z, we integrate $1/(z - a)$ for a point a not
located on $\mu$, the result would obviously be the same. In conclusion:


Theorem 4. The integral of $1/(z - a)$ along an admissible path $\mu$ in $C - \{a\}$
is equal to the variation of a uniform branch of Log(z - a) along $\mu$.


Such a branch is of the form

(4.3)    $L(t) = \log|\mu(t) - a| + i \cdot A(t)$


where log is the elementary function defined on $R_+^*$, and where, in its turn,
$t \mapsto A(t)$ is a uniform branch along $\mu$ of the no less pseudo-function Arg(z - a),
i.e. a continuous function such that




(4.4)    $\mu(t) - a = |\mu(t) - a| \cdot \exp[i \cdot A(t)]$


for all t. Assuming $\mu$ to be closed, the term $\log|\mu(t) - a|$ of (3) has the same
values at t = 0 and t = 1 since it only depends on $\mu(t)$. Its variation along
$\mu$ is, therefore, zero, so that, up to a factor i, L(1) - L(0) is equal to the
variation A(1) - A(0) of the argument of $\mu(t) - a$. But as the various possible
values of the argument of a complex number differ by multiples of $2\pi$,


(4.5)    $A(1) - A(0) = 2\pi \cdot \mathrm{Ind}_\mu(a)$,


where $\mathrm{Ind}_\mu(a)$ is an integer called the index of a with respect to $\mu$, unless it
is the index of $\mu$ with respect to a. As explained at the end of Chap. IV, § 4,
physically, it is the number of rotations carried out by the ray with initial
point a and passing through $\mu(t)$ as t varies from 0 to 1. It is a positive
or negative number calculated by taking into account the direction of the
rotations. This will be justified in Chap. X, no 3, (iii). Clearly, $\mathrm{Ind}_\mu(a)$ only
depends on the homotopy class of $\mu$ in $C - \{a\}$. As a result, setting


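    $\mathrm{Supp}(\mu) = \mu(I)$

for the support of the path $\mu$, the next result follows:

Corollary. For any closed path $\mu$ in C and all $a \in C - \mathrm{Supp}(\mu)$,


(4.6)    $\int_\mu \frac{dz}{z - a} = \int_I \frac{\mu'(t)}{\mu(t) - a}\,dt = 2\pi i \cdot \mathrm{Ind}_\mu(a)$.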


Note that the left hand side of (6), and hence the right hand side, is a
continuous function of a outside the compact set $\mu(I) = \mathrm{Supp}(\mu)$, i.e. outside
the image of I under $\mu$, since the function of (t, a) integrated over [0, 1]
is continuous (Chap. V, no 9, Theorem 9). From the fact that $a \mapsto \mathrm{Ind}_\mu(a)$
has values in Z, and that a continuous function with values in Z is constant
on every connected set, it can be deduced that the index of a point a with respect
to a closed path $\mu$ only depends on the connected component of a in the open
set $C - \mathrm{Supp}(\mu)$. By (6), it clearly approaches 0 as |a| increases indefinitely.
Hence it is zero in the unbounded connected component of $C - \mathrm{Supp}(\mu)$. The
latter is unique because it at least contains the exterior of any disc D having
$\mathrm{Supp}(\mu)$ as a subset, so that all other components are contained in D, and
hence are bounded.
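The description of the index as a variation of the argument, relation (5), translates directly into a small computation. The following Python sketch is an illustration under stated assumptions: the function names `index` and `circle`, the number of sample points, and the choice of circles as test paths are ours. It follows the argument of $\mu(t) - a$ along a closed path by adding up the small angles between consecutive sample points, and divides the total by $2\pi$.

```python
import cmath
import math


def index(path, a, n=4000):
    """Approximate Ind_mu(a) for a closed path mu: [0, 1] -> C that avoids a.

    The argument of mu(t) - a is followed continuously by adding the angle
    between consecutive sample points; the total is 2*pi*Ind_mu(a).
    Assumes n is large enough that each step turns by less than pi.
    """
    total = 0.0
    prev = path(0.0) - a
    for k in range(1, n + 1):
        cur = path(k / n) - a
        total += cmath.phase(cur / prev)  # small signed angle from prev to cur
        prev = cur
    return round(total / (2.0 * math.pi))


def circle(turns, radius=1.0):
    """The closed path t -> radius * exp(2*pi*i*turns*t) on [0, 1]."""
    return lambda t: radius * cmath.exp(2j * math.pi * turns * t)


if __name__ == "__main__":
    print(index(circle(1), 0j))           # one positive turn around 0
    print(index(circle(2), 0.3 + 0.1j))   # two positive turns around an inside point
    print(index(circle(-1), 0j))          # one clockwise turn
    print(index(circle(3), 2.0))          # point in the unbounded component
```

The last call illustrates the remark above: for a point of the unbounded component of $C - \mathrm{Supp}(\mu)$ the computed index is zero, however many times the path winds.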

We sometimes write $\mathrm{Ext}(\mu)$, exterior of $\mu$, for the set of $z \in C - \mathrm{Supp}(\mu)$
where $\mathrm{Ind}_\mu(z) = 0$ and $\mathrm{Int}(\mu)$, interior of $\mu$, for the set of z where
$\mathrm{Ind}_\mu(z) \neq 0$. The exterior of $\mu$ contains the non-compact connected
component of $C - \mathrm{Supp}(\mu)$, but may be strictly larger. These notions are not at
all related to those defined in Chap. III, no 1 with respect to an arbitrary
subset of C, but refer instead to what has been said at the end of Chap. III,
§4 (Jordan curve theorem) in the case of a “simple” curve, i.e. homeomorphic
to the unit circle T. This supposedly simple case being already quite
subtle, it would be wrong to think that the general case is less so even if, in
practice, everything is always more or less obvious.


Fig. 4.5. Freitag-Busam, p. 240


(ii) Length of a path. It is often necessary to find an upper bound for an
integral


    $\int_\mu f(\zeta)\,d\zeta = \int_I f[\mu(t)]\,\mu'(t)\,dt$.

Setting


(4.7)    $\|f\|_\mu = \sup_{t \in I} |f[\mu(t)]|$


for the uniform norm of f along $\mu$, i.e. the uniform norm of f on the set
of points $\mathrm{Supp}(\mu) = \mu(I)$ of the “curve” described by $\mu(t)$ in the sense of
Chap. III, no 7,


    $\left|\int_\mu f(\zeta)\,d\zeta\right| \le \|f\|_\mu \int_I |\mu'(t)|\,dt$


obviously follows. The integral occurring on the right hand side has a
geometric interpretation. Indeed, if a subdivision $u = t_0 < t_1 < \dots < t_n = v$ of
I = [u, v] is chosen to be sufficiently fine so that $\mu'$ is constant up to r > 0²³
on each partial interval, then

23 Recall our language conventions (Chapter III, no 2). A numerical function f is
constant up to r on a set E if $|f(x) - f(y)| < r$ for all $x, y \in E$. An equality
a = b holds up to r if $|a - b| < r$.
    $\int_I |\mu'(t)|\,dt = \sum_i |\mu'(t_i)|\,(t_{i+1} - t_i)$   up to $m(I)\,r$,


where m(I) is the usual length of I. But setting $\zeta_i = \mu(t_i)$, by the FT,


    $\zeta_{i+1} - \zeta_i = \int_{t_i}^{t_{i+1}} \mu'(t)\,dt$

and the right hand side of this relation is equal to $\mu'(t_i)(t_{i+1} - t_i)$ up to
$r(t_{i+1} - t_i)$. Hence the error made by writing


    $\int_I |\mu'(t)|\,dt = \sum_i |\zeta_{i+1} - \zeta_i|$


is bounded above by $m(I)r + \sum_i (t_{i+1} - t_i)r = 2m(I)r$. Now, the sum on the
right hand side of the previous relation is just the usual length of the piecewise
linear path connecting the $\zeta_i$; the finer the subdivision considered, the closer this
path is to $\mu$. Hence, the length of a path $\mu$ can reasonably be defined by the
formula


(4.8)    $m(\mu) = \int |\mu'(t)|\,dt$,


where integration is over I: the scalar (not the vectorial) velocity of the
moving object along a path is integrated with respect to time. The letter m
suggests an analogy with the usual length or measure of an interval in R.
The conclusion of these arguments is the inequality


(4.9)    $\left|\int_\mu f(\zeta)\,d\zeta\right| \le m(\mu) \cdot \|f\|_\mu$.


This result supersedes the almost trivial inequality in real variables we came 
across and is constantly used. 

Note that if $\mu(t)$ is replaced by $\nu(t) = \mu[\varphi(t)]$, where $\varphi$ is a $C^1$ map from
an interval $J \subset R$ to the interval I where $\mu$ is defined, which does not change
$\mathrm{Supp}(\mu)$, then


    $m(\nu) = \int |\mu'[\varphi(t)]\,\varphi'(t)|\,dt = \int |\mu'[\varphi(t)]| \cdot |\varphi'(t)|\,dt$.


Hence, the change of variable formula for integrals (Chap. V, §6, no 19)
involving the function $\varphi'(t)$, but not its absolute value, gives the equality
$m(\mu) = m(\nu)$ only if the sign of $\varphi'$ is constant, i.e. if $\varphi$ is monotone: it is
generally accepted that the direct route from Paris to Marseille is shorter than
the Paris-Lyon-Dijon-Lyon-Marseille one. Despite its terminology, the length
of a path is, therefore, a kinematic rather than a geometric notion applicable
to the set $\mathrm{Supp}(\mu) = \mu(I)$. In fact, it would be better to call “route” what
everyone calls “path”, but it is too late.²⁴
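The kinematic nature of the length can be seen on a small numerical experiment. The sketch below is a hedged illustration: the helper names and the particular reparametrization $\varphi(t) = 4t(1 - t)$ are our assumptions. It approximates $m(\mu)$ by the length of an inscribed polygon, as in the argument leading to (8), and compares a segment with the same segment travelled there and back.

```python
import cmath
import math


def polygon_length(path, u, v, n=20000):
    """Length of the inscribed polygon with vertices path(t_i), t_i equally spaced in [u, v]."""
    points = [path(u + (v - u) * k / n) for k in range(n + 1)]
    return sum(abs(q - p) for p, q in zip(points, points[1:]))


def segment(t):
    """mu(t) = t on I = [0, 1]: the segment from 0 to 1."""
    return complex(t, 0.0)


def there_and_back(t):
    """nu(t) = mu(phi(t)) with phi(t) = 4t(1 - t), which is not monotone on [0, 1]."""
    return segment(4.0 * t * (1.0 - t))


def unit_circle(t):
    """t -> exp(2*pi*i*t), one positive turn around 0."""
    return cmath.exp(2j * math.pi * t)


if __name__ == "__main__":
    print(polygon_length(segment, 0.0, 1.0))         # close to 1
    print(polygon_length(there_and_back, 0.0, 1.0))  # close to 2, same support
    print(polygon_length(unit_circle, 0.0, 1.0))     # close to 2*pi
```

Both `segment` and `there_and_back` have the same support [0, 1], yet the second length is about twice the first, in line with the Paris to Marseille remark.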


24 Path: any road that can be taken to go from one place to another. Route: action
of crossing space from one place to another. (Littré). 
(iii) Cauchy’s integral formula for a circle. The calculations done in
no 3, (iii) explain Cauchy’s integral formula for a circle (Chap. V, no 5),
namely


(4.10)   $2\pi i f(a) = \int_\mu \frac{f(\zeta)}{\zeta - a}\,d\zeta$,

where a is in the interior of the circle $\mu : t \mapsto R\exp(2\pi i t)$ centered at 0 along
which we integrate and where f is holomorphic on an open disc D of radius
> R. The map


    $(s, \zeta) \mapsto a + (1 - s)(\zeta - a)$


is indeed a contraction from D onto the point a. For given s, it is the
homothety with centre a and ratio 1 - s, which transforms $\mu$ into a circle $\mu_s$
surrounding a and whose radius approaches 0 as s tends to 1: an example
of a linear deformation. Hence, the right hand side of (10), where a function
holomorphic on the open set $D - \{a\}$ is integrated, does not change if $\mu$
is replaced by $\mu_s$, with s < 1, where this is a strict inequality. But for every
r > 0, there exists r' > 0 such that


    $|\zeta - a| < r' \implies |f(\zeta) - f(a) - f'(a)(\zeta - a)| \le r|\zeta - a|$


since f is differentiable at the point a. Therefore, if s is sufficiently near 1
for $\mu_s$ to be contained in the disc $|\zeta - a| < r'$, and if $f(\zeta)$ is replaced by
$f(a) + f'(a)(\zeta - a)$ in the integral along $\mu_s$, for all $\zeta$, the error made on the
function $f(\zeta)/(\zeta - a)$ to be integrated is bounded above by r. So, in view
of the standard upper bound (9), the error on the right hand side of (10) is
bounded above by $m(\mu_s)r$, where $m(\mu_s)$ is the length of $\mu_s$. If it is assumed
that for one complete circuit of a circumference, our scholarly definition of
the length coincides with that of Archimedes (it is anyhow his, since he
approximated the circle with inscribed polygons, without, however, following
it through), then it is clear that $s \mapsto m(\mu_s)$ is bounded on I; in fact,
$m(\mu_s)$ is the product of the length of the initial circle and of the homothety
ratio 1 - s. Hence the error made on the right hand side of (10) has upper
bound r up to a constant factor. In other words,


(4.11)   $\int_\mu \frac{f(\zeta)}{\zeta - a}\,d\zeta = \lim_{s \to 1} \int_{\mu_s} \left[\frac{f(a)}{\zeta - a} + f'(a)\right] d\zeta = \lim_{s \to 1} \left[ f(a) \int_{\mu_s} \frac{d\zeta}{\zeta - a} + \int_{\mu_s} f'(a)\,d\zeta \right]$.


The contribution of $f'(a)$ to the second expression is zero since a constant
function is being integrated along a closed path. It remains to evaluate the
integral of $1/(\zeta - a)$ along $\mu_s$; for this, $\mu_s$ may be replaced by a closed contour
homotopic to it as a closed path in the open set $C - \{a\}$ where the function
$1/(\zeta - a)$ is holomorphic; for example, by a circle with centre a. By the
corollary to Theorem 4, this is the product of $2\pi i$ and of the index of a with
respect to such a circle, which is obviously equal to 1. In view of the factor
f(a), the final result is, therefore, indeed $2\pi i f(a)$. This is one of Cauchy’s
own proofs.
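Formula (10) is easy to test numerically. The following Python sketch is only an illustration: the function names, the number of sample points and the choice f = exp are assumptions. It approximates the integral along $\mu : t \mapsto R\exp(2\pi i t)$ by an equally spaced sum in t and divides by $2\pi i$.

```python
import cmath
import math


def circle_integral(g, radius, n=400):
    """Approximate the integral of g along mu: t -> radius*exp(2*pi*i*t), 0 <= t <= 1."""
    total = 0j
    for k in range(n):
        e = cmath.exp(2j * math.pi * k / n)
        # g(mu(t_k)) * mu'(t_k) * dt with dt = 1/n
        total += g(radius * e) * (2j * math.pi * radius * e) / n
    return total


def cauchy_value(f, a, radius, n=400):
    """Right hand side of (10) divided by 2*pi*i; approximates f(a) when |a| < radius."""
    return circle_integral(lambda zeta: f(zeta) / (zeta - a), radius, n) / (2j * math.pi)


if __name__ == "__main__":
    a = 0.3 + 0.4j
    print(cauchy_value(cmath.exp, a, 1.0), cmath.exp(a))  # the two values should agree
    print(cauchy_value(cmath.exp, 2.0, 1.0))              # a outside the circle: close to 0
```

The second call takes a outside the circle; the integrand is then holomorphic on a disc containing the circle, so the integral vanishes, consistently with an index equal to 0.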


(iv) Modes of convergence of holomorphic functions. Recall that formula
(10), an immediate consequence of the elementary theory of Fourier
series, is the essential tool in the proof of Weierstrass’ theorem on limits of
holomorphic functions (Chap. VII, § 4, no 19). Indeed, if f is holomorphic on
the open set G, if K is a compact subset of G and if r > 0 is strictly less
than the distance from K to the boundary of G, then, taking for $\mu$ the circle
$|\zeta - w| = r$, (10) may be applied for all $w \in K$. Set $\zeta = w + re(t)$, where
$e(t) = \exp(2\pi i t)$, whence $d\zeta = 2\pi i\,re(t)\,dt$ and


    $f(\zeta) = \sum_{n \ge 0} f^{(n)}(w)\,r^n e(t)^n/n!$.

Then,

(4.12)   $f^{(n)}(w)/n! = \int f(w + re(t))\,r^{-n} e(t)^{-n}\,dt$,

where integration is over [0, 1]. Hence setting K(r) to be the compact set
contained in G, consisting of points at a distance $\le r$ from K, and maximizing
the left hand side over K, we get


(4.13)   $\|f^{(n)}\|_K \le n!\,r^{-n}\,\|f\|_{K(r)}$,


which proves that, in the vector space of holomorphic functions on G, the map
$f \mapsto f^{(n)}$ is continuous with respect to the topology of compact convergence.²⁵
Indeed, if a sequence $(f_n)$ converges uniformly to a limit f on every compact
subset of G, the same holds for the successive derivatives of the $f_n$ in the real
sense, which is identical up to constant factors to the complex sense. The
limit function is, therefore, $C^\infty$ and satisfies Cauchy’s condition by passing
to the limit. The limit is, therefore, holomorphic, and as the derivatives in
the real sense converge uniformly to those of f on every compact subset, the
expected result follows (Chap. VII, no 19, Weierstrass’ theorem).
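Relation (12) and the resulting bound (13) can be checked on an example. In the sketch below, the function names, the sample count and the choice f = exp, w = 0, r = 2 are illustrative assumptions; the integral over [0, 1] is replaced by an equally spaced sum.

```python
import cmath
import math


def e(t):
    """e(t) = exp(2*pi*i*t), as in the text."""
    return cmath.exp(2j * math.pi * t)


def derivative_over_factorial(f, w, n, r, samples=256):
    """Approximate f^(n)(w)/n! by the integral (4.12) over [0, 1]."""
    total = 0j
    for k in range(samples):
        t = k / samples
        total += f(w + r * e(t)) * r ** (-n) * e(t) ** (-n)
    return total / samples


def sup_on_circle(f, w, r, samples=256):
    """Maximum of |f| over the sample points of the circle |zeta - w| = r."""
    return max(abs(f(w + r * e(k / samples))) for k in range(samples))


if __name__ == "__main__":
    n, r = 5, 2.0
    value = derivative_over_factorial(cmath.exp, 0j, n, r)
    print(value, 1.0 / math.factorial(n))                    # should agree
    print(abs(value), r ** (-n) * sup_on_circle(cmath.exp, 0j, r))  # left <= right
```

For f = exp all derivatives at 0 equal 1, so the first line compares the computed value with 1/5!; the second line displays the two sides of the bound that leads to (13) at the single point w = 0.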

In fact, Weierstrass’ conclusion can be reached by making seemingly much
weaker assumptions than compact convergence on the convergence of the
sequence $(f_n)$. Let us in particular consider $L^p$ ($1 \le p < +\infty$) convergence in
the theory of integration. It is defined by the norm


(4.14)   $\|f\|_p = \left(\iint |f(z)|^p\,dm(z)\right)^{1/p}$,


25 Recall (Chapter III, Appendix, no 8) that it is defined by the seminorms
$f \mapsto \|f\|_K$, where $K \subset G$ is an arbitrary compact set. The inequality obtained shows
that if f converges uniformly on K(r), then $f^{(n)}$ converges uniformly on K.
where integration is over the given open set G²⁶ with respect to the usual
measure dm(z) = dxdy. We need to show that for holomorphic functions on
G, the relation


(4.15)   $\lim \|f - f_n\|_p = 0$   implies   $\lim f_n(z) = f(z)$

uniformly on any compact subset K of G, i.e. that there is an upper bound

(4.16)   $\|f\|_K \le M_K \|f\|_p$


valid for any compact subset $K \subset G$ and any function f holomorphic on G.
In fact, since under these conditions, for all r, $\lim f_n^{(r)}(z) = f^{(r)}(z)$ uniformly
on every compact subset, we also get upper bounds


(4.16’)  $\|f^{(r)}\|_K \le M_{K,r} \|f\|_p$


for all $r \in N$.
To prove (16), let us again consider the compact set $K(r) \subset G$. For all
$a \in K$, the disc D(a, r) is contained in K(r) and


    $f(a) = \int_0^1 f[a + re(t)]\,dt$,


and likewise with r replaced by any $\rho \le r$. If we assume that in polar
coordinates $x = \rho\cos 2\pi t$, $y = \rho\sin 2\pi t$, the measure dm(z) = dxdy is given
by $dxdy = 2\pi\rho\,d\rho\,dt$, then


    $\iint_{D(a,r)} f(z)\,dxdy = 2\pi \int_0^r \rho\,d\rho \int_0^1 f[a + \rho e(t)]\,dt = \pi r^2 f(a)$.


Since $\pi r^2$ is the area of the disc D(a, r), this means that the value of a
holomorphic function at the centre of a disc is equal to its mean value over
the disc. As $D(a, r) \subset K(r)$ for all $a \in K$,


(4.17)   $\pi r^2 \|f\|_K \le \iint_{K(r)} |f(z)|\,dm(z) \le \iint_G |f(z)|\,dm(z) = \|f\|_1$,


which proves (16) for p = 1. For $1 < p < +\infty$, applying Hölder’s inequality
(Cauchy-Schwarz for p = 2) to the functions f and 1 on K(r), we get


    $\pi r^2 \|f\|_K \le \left(\iint_{K(r)} |f(z)|^p\,dm(z)\right)^{1/p} \left(\iint_{K(r)} dm(z)\right)^{1/q}$


26 To define integral (14), which pertains to a positive continuous function, the
method holding for lower semicontinuous functions must be applied (Chap. V, § 9,
no 33, theorem 31): consider functions $\le |f(z)|^p$ in $R^2$, everywhere continuous,
positive on G and zero outside compact subsets of G. The integral of $|f(z)|^p$ is
then the supremum of the integrals of these functions. Considering the supremum
of the integrals of $|f(z)|^p$ extended over compact sets $K \subset G$ would amount to the
same. This presupposes that we know how to integrate over an arbitrary compact
set (same reference).

where 1/p + 1/q = 1 (Chap. V, §3, no 14); the first factor is at most
$\|f\|_p$ and the second does not depend on f, hence (16). (16’) can be similarly
obtained from (12).
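The mean value property behind (17) can be observed numerically. The sketch below is an illustration only: the function name `disc_mean`, the grid sizes and the two test functions are assumptions. It integrates over D(a, r) in the polar coordinates used above, with the midpoint rule in $\rho$ and an equally spaced sum in t.

```python
import cmath
import math


def disc_mean(f, a, r, n_rho=200, n_t=200):
    """Approximate (1/(pi r^2)) times the integral of f over the disc D(a, r).

    Uses dxdy = 2*pi*rho drho dt with x = rho*cos(2*pi*t), y = rho*sin(2*pi*t).
    """
    d_rho = r / n_rho
    total = 0j
    for j in range(n_rho):
        rho = (j + 0.5) * d_rho
        # mean of f over the circle of radius rho centred at a
        ring = sum(f(a + rho * cmath.exp(2j * math.pi * k / n_t)) for k in range(n_t)) / n_t
        total += 2.0 * math.pi * rho * ring * d_rho
    return total / (math.pi * r * r)


if __name__ == "__main__":
    a = 0.2 - 0.1j
    print(disc_mean(cmath.exp, a, 1.5), cmath.exp(a))     # holomorphic: mean equals f(a)
    print(disc_mean(lambda z: abs(z) ** 2, 0j, 1.0))      # not holomorphic: about 0.5, not 0
```

The second call uses the non-holomorphic function $|z|^2$, whose mean over the unit disc is 1/2 while its value at the centre is 0; this is only meant as a contrast and is not part of the argument in the text.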

The reader will easily verify that relation (16) continues to hold if, instead
of the usual measure dxdy, we use a measure $d\mu(z) = \rho(z)\,dm(z)$ to define
$\|f\|_p$, i.e. the formula


(4.18)   $\|f\|_p = \left(\iint |f(z)|^p \rho(z)\,dxdy\right)^{1/p}$,


where the given “density” $\rho$ is continuous and has strictly positive values. It
suffices to note that the minimum m over K(r) of $\rho$ is > 0. Hence


    $\iint_{K(r)} |f(z)|^p\,dm(z) \le m^{-1} \iint_{K(r)} |f(z)|^p \rho(z)\,dm(z)$


for any function f.
From this result, it follows that, for any measure $d\mu(z)$ of the form (18),
the normed vector space $H^p(G, \mu)$ of holomorphic functions such that


    $\iint_G |f(z)|^p\,d\mu(z) < +\infty$


is complete:²⁷ indeed, inequality (16) transforms every Cauchy sequence with
respect to the $L^p$ norm into a Cauchy sequence with respect to compact
convergence. Hence we get a holomorphic limit which can easily be inferred
to also be the $L^p$ limit of the functions $f_n$. This is obvious if we have the
benefit of the simplest results from Lebesgue theory.²⁸

In particular, $H^2(G, \mu)$, equipped with the scalar product


    $(f \mid g) = \iint f(z)\,\overline{g(z)}\,d\mu(z)$,


is a Hilbert space which plays an important role in some questions, especially 
complex Fourier transformations, the theory of modular functions, conformal 
representations, etc. 


In fact, as observed by Laurent Schwartz half a century ago, much more 
can be proved. Let G be an open subset of C and D(G) the vector space of 


27 As we will see in n° 12, it can be reduced to zero if the open set G is not bounded,
especially in the trivial case when G = C since (17) then applies for all r > 0.

28 Given a Cauchy sequence $(f_n)$ in an $L^p$-space, if $\lim f_n(x) = f(x)$ exists almost
everywhere, then $f \in L^p$ and $\lim \|f - f_n\|_p = 0$ (Chap. XI). In the case at hand,
the functions $f_n$ and f are continuous and the sequence converges uniformly on
every compact subset. Hence the theorem is in fact about Riemann integrals.
But a proof based on elementary arguments requires far more ingenuity than the
use of Lebesgue’s sledgehammer.
$C^\infty$ functions with compact support in G. Like on R (Chapter V, § 10), it
can be used to define distributions on G. For all $r \in N$, define a semi-norm
on D(G) (Appendix of Chap. III, end of no 8)


    $N_r(\varphi) = \sum_{p,q \le r} \sup |D_1^p D_2^q \varphi(z)| = \sum_{p,q \le r} \|D_1^p D_2^q \varphi\|_G$,


where $D_1$ and $D_2$ are the differential operators with respect to the real
coordinates x, y of z. Writing D(G, K) for the subspace of the functions $\varphi \in D(G)$
which vanish outside a given compact set $K \subset G$, a distribution on G is then
a linear functional $\varphi \mapsto T(\varphi)$ on D(G), continuous in the following sense: for
any compact set $K \subset G$, there exist $r \in N$ and a constant $M_K(T) > 0$ such
that


    $|T(\varphi)| \le M_K(T)\,N_r(\varphi)$   for all $\varphi \in D(G, K)$.


Like on R, the successive derivatives of the distributions can be defined by
iterating the formulas defining the first derivatives


    $D_1T : \varphi \mapsto -T(D_1\varphi)$,   $D_2T : \varphi \mapsto -T(D_2\varphi)$


of T. If T is defined by a function f of class $C^\infty$ on G, i.e. if


    $T(\varphi) = \iint \varphi(z) f(z)\,dm(z)$,


it can be immediately checked that $D_iT$ is defined by the function $D_if$: like over
R, integrate by parts²⁹ with respect to x or y. The notion of a holomorphic
function can then be generalized by saying that a distribution T on G is
holomorphic if it satisfies Cauchy’s condition


    $D_2T = iD_1T$


or, equivalently, if T vanishes on all functions of the form $\partial\varphi/\partial\bar{z}$.

Having set this, (i) any holomorphic distribution is defined by a holomorphic
function, in other words this is not a generalization; (ii) if a sequence $T_n$
of holomorphic distributions converges to a distribution T, i.e. if


    $\lim T_n(\varphi) = T(\varphi)$   for all $\varphi \in D(G)$,


then T is holomorphic (obvious from the definition); (iii) if $f_n$ and f are
holomorphic functions defining $T_n$ and T, then $f_n(z)$ converges uniformly to
f(z) on any compact subset of G.

Regarding holomorphic functions, any definition of convergence more
restrictive than convergence in the sense of distributions implies, therefore,
compact convergence. This result covers every type of convergence encountered
in practice, in particular the $L^p$ convergence considered above.


29 As $\varphi$ is zero outside a compact subset $K \subset G$, the function under the $\int$ sign
is the restriction to G of a $C^\infty$ function on $R^2$ and vanishes outside K, hence
outside a square $I \times I$, where I is a compact interval of R.


(v) Analyticity of holomorphic functions. Cauchy’s formula for a circle
assumes only that f is holomorphic in the sense initially defined in Chap. II,
n° 19. Using the theory of Fourier series, it was directly shown in Chap. VII,
no 14 that in fact all holomorphic functions are analytic, i.e. have power series
expansions, and (10) was deduced. Cauchy’s formula provides another proof of
analyticity, that of Cauchy, which everyone has reproduced and which proceeds
in reverse order. It suffices to replace a by a variable z in (10), and to write
that


    $1/(\zeta - z) = 1/[\zeta(1 - z/\zeta)] = \sum_{n \ge 0} z^n/\zeta^{n+1}$.


This geometric series expansion is justified as long as $|z| < |\zeta| = R$. For given
z, this series of functions of $\zeta$ converges normally (Chap. III, n° 8) on the
circle $|\zeta| = R$ since it is dominated by the series $\sum q^n/R$, with q = |z|/R < 1.
As the function f is continuous on the circle, it can be integrated term by
term. Hence


(4.19)   $2\pi i f(z) = \sum c_n z^n$   with   $c_n = \int f(\zeta)\,\zeta^{-n-1}\,d\zeta$,


where integration is along the circle of radius R, qed. 

Like the one of Chap. VII, this proof shows that expansion (19) holds
in the largest disc D centered at 0 contained in the domain G where f is
holomorphic. As indeed all circles centered at 0 are pairwise homotopic as
closed paths in the domain $G - \{0\}$ where the function $f(\zeta)\zeta^{-n-1}$ being
integrated is holomorphic, the integral representing the $c_n$ is independent
of R as long as the closed disc $|z| \le R$ is contained in G. Since the power
series then converges for |z| < R, the result follows. This argument also shows
that Cauchy’s integral formula for a disc characterizes holomorphic functions
on the disc since it implies a power series expansion or else because the
function being integrated on a compact set depends holomorphically on the
parameter z, which allows theorems on differentiation under the $\int$ sign to be
applied.
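The independence of the coefficients from R can be watched on an example. In the sketch below, the function names, the sample count and the test function f(z) = 1/(2 - z), holomorphic for |z| < 2, are illustrative assumptions; the quantity computed is $c_n/2\pi i$ in the notation of (19), i.e. the ordinary Taylor coefficient, which for this f equals $2^{-n-1}$.

```python
import cmath
import math


def taylor_coefficient(f, n, radius, samples=512):
    """c_n/(2*pi*i) from (4.19): the integral of f(zeta) zeta^(-n-1) over |zeta| = radius."""
    total = 0j
    for k in range(samples):
        zeta = radius * cmath.exp(2j * math.pi * k / samples)
        # d(zeta) = 2*pi*i*zeta dt, so zeta^(-n-1) d(zeta)/(2*pi*i) = zeta^(-n) dt
        total += f(zeta) * zeta ** (-n)
    return total / samples


def f(z):
    """Hypothetical example: holomorphic on the disc |z| < 2."""
    return 1.0 / (2.0 - z)


def partial_sum(z, radius, terms=60):
    """Partial sum of the power series with coefficients computed on |zeta| = radius."""
    return sum(taylor_coefficient(f, n, radius) * z ** n for n in range(terms))


if __name__ == "__main__":
    for n in range(4):
        print(n, taylor_coefficient(f, n, 1.0), taylor_coefficient(f, n, 1.5), 2.0 ** (-n - 1))
    z = 0.5 + 0.5j
    print(partial_sum(z, 1.5), f(z))
```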


(vi) Laurent Series. Consider a function f holomorphic on an annulus
G : r < |z| < R and let z be a point of G. Choose numbers r' and R'
such that r < r' < |z| < R' < R and a ray D with initial point 0 which
does not pass through z; let A and B be the points where it meets the circles
of radius r' and R'. Consider the closed path $\mu$ consisting of AB, followed
by the circumference $|\zeta| = R'$ oriented positively, then by BA followed in
turn by the circle $|\zeta| = r'$ oriented clockwise. It is obviously homotopic in
$G - \{z\}$ to a circumference $\gamma$ centered at z contained in G. The integrals of
$f(\zeta)/(\zeta - z)$ along these two paths being equal, Cauchy’s formula for a circle
shows that


(4.20)   $f(z) = \frac{1}{2\pi i} \int_\mu \frac{f(\zeta)}{\zeta - z}\,d\zeta$.


Fig. 4.6.


The contribution from the line segment AB is clearly zero since it is
followed twice in reverse directions. Hence, this leaves the difference between
the integrals taken along the circumferences $|\zeta| = R'$ and $|\zeta| = r'$ oriented
positively.

For $|\zeta| = R'$, $|z/\zeta| < 1$ and so


    $1/(\zeta - z) = 1/[\zeta(1 - z/\zeta)] = \sum_{n \ge 0} z^n/\zeta^{n+1}$.

The contribution from the circumference $|\zeta| = R'$ to integral (20) is, therefore,
like in (v), the power series

    $\sum_{n \ge 0} c_n z^n$   with   $2\pi i\,c_n = \int_{|\zeta| = R'} f(\zeta)\,\zeta^{-n-1}\,d\zeta$.


For $|\zeta| = r'$, $|\zeta/z| < 1$, which allows us to write


    $1/(\zeta - z) = -1/[z(1 - \zeta/z)] = -\sum_{n \ge 0} \zeta^n/z^{n+1}$.


The product with $f(\zeta)$ is again integrable term by term: this is clearly a
normally convergent series. Replacing n by -n - 1 where, this time, n < 0,
its contribution is easily seen to be equal to


    $\sum_{n < 0} c_n z^n$   with   $2\pi i\,c_n = \int_{|\zeta| = r'} f(\zeta)\,\zeta^{-n-1}\,d\zeta$.


Hence formula (20) gives a series expansion in the annulus r' < |z| < R'


(4.21)   $f(z) = \sum_{n \in Z} c_n z^n$


whose coefficients are obtained by integrating $f(\zeta)\zeta^{-n-1}$ either over $|\zeta| = R'$,
or over $|\zeta| = r'$. The choice of the circle is in fact unimportant provided it is
in the interior of the annulus G : r < |z| < R since all circles are obviously
homotopic in G as closed paths. And as the expansion holds for r' < |z| < R'
provided that r < r' < R' < R, this means that it holds in all of G. This is
Laurent’s theorem with coefficients given by an integral.
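As an illustration of (21), the coefficients can be computed numerically on two different circles of the annulus and compared. The sketch below rests on assumptions of ours: the function names, the sample count, and the test function f(z) = 1/((z - 1)(z - 2)) on the annulus 1 < |z| < 2, for which a partial fraction computation by hand gives $c_n = -2^{-n-1}$ for $n \ge 0$ and $c_n = -1$ for $n < 0$.

```python
import cmath
import math


def laurent_coefficient(f, n, radius, samples=512):
    """c_n from (4.21): (1/(2*pi*i)) times the integral of f(zeta) zeta^(-n-1) over |zeta| = radius."""
    total = 0j
    for k in range(samples):
        zeta = radius * cmath.exp(2j * math.pi * k / samples)
        # zeta^(-n-1) d(zeta)/(2*pi*i) = zeta^(-n) dt
        total += f(zeta) * zeta ** (-n)
    return total / samples


def f(z):
    """Hypothetical example: holomorphic on the annulus 1 < |z| < 2."""
    return 1.0 / ((z - 1.0) * (z - 2.0))


def laurent_sum(z, radius, order=60):
    """Sum of c_n z^n for -order <= n <= order, coefficients computed on |zeta| = radius."""
    return sum(laurent_coefficient(f, n, radius) * z ** n for n in range(-order, order + 1))


if __name__ == "__main__":
    for n in (-3, -1, 0, 2):
        print(n, laurent_coefficient(f, n, 1.2), laurent_coefficient(f, n, 1.8))
    z = 1.5j
    print(laurent_sum(z, 1.5), f(z))
```

The two columns printed for each n should coincide, reflecting the fact that the choice of the circle inside the annulus is unimportant.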


5 — The Residue Formula 


(i) The residue formula. A formula like (4.10), valid for any closed path $\mu$ in G,
any function f holomorphic on G and any $z \in G$, as well as a more general
and extremely classical result, can be proved in a simply connected domain:
Cauchy’s residue formula, an inexhaustible source of exercises and examination
questions that, though sometimes subtle, have long become stale. Indeed,
everything can be proved at the same time and G need not be assumed to
be simply connected provided $\mu$ is taken to be null-homotopic in G.

First some remarks about the Laurent series of a function f holomorphic
for 0 < |z - a| < R, a being an isolated singular point of f (Chapter VII, § 4,
n° 16). We saw above that it is given by


(5.1)    $f(z) = \sum c_n (z - a)^n$   with   $2\pi i\,c_n = \int f(\zeta)(\zeta - a)^{-n-1}\,d\zeta$


where integration is over any circle $t \mapsto a + r\exp(2\pi i t)$ of radius r < R
centered at a. The sum of the terms of degree n < 0, i.e. the polar or singular
part of the Laurent series, is a power series in w = 1/(z - a); it converges
for 0 < |z - a| < R, i.e. for |w| > 1/R. Now, the domain of convergence of
a power series is the interior of a disc; if it converges outside a disc, then it
converges everywhere. Therefore, the series given by the terms of negative
degree in (1) converges for all $z \neq a$. So, in the neighbourhood of a, f is the
sum of a power series and of a holomorphic function on $C - \{a\}$.

Having said this, let us consider a function f which, instead of being
holomorphic everywhere on a domain G, is so on G - S = G', where S is,
for the time being, a finite set. Let $\rho_a = \mathrm{Res}(f, a)$ be the residue of f at $a \in S$,
i.e. the coefficient of 1/(z - a) in its Laurent series at a. Write $g_a(z)$ for the
sum of the terms of degree $\le -2$ (as seen above, it is in fact defined and
holomorphic on $C - \{a\}$) and consider the function


(5.2)    $g(z) = f(z) - \sum_{a \in S} [g_a(z) + \rho_a/(z - a)]$;


each $g_a$ being holomorphic on $C - \{a\}$, g is at least defined on G'. Its only
singular points in G are among the points $a \in S$, but, since in (2) the
terms indexed by $b \neq a$ are holomorphic at a, the polar part of
the Laurent series of g with respect to a is obtained by removing from the
series of f that of $g_a(z) + \rho_a/(z - a)$, i.e. the sum of terms of degree < 0
of the Laurent series of f. Therefore, in reality, the Laurent series of g with
respect to a is a power series, and denoting its constant term by g(a),
g is transformed into a function defined and holomorphic on all of G.

To compute the integral of f along a closed path $\mu$ in G', it then suffices
to compute those of the functions $g_a$, the functions 1/(z - a) and of g.

By definition of the index, to begin with we have


(5.3)    $\int_\mu \frac{d\zeta}{\zeta - a} = 2\pi i \cdot \mathrm{Ind}_\mu(a)$.


As, on the other hand, g is holomorphic on all of G and as, by assumption, $\mu$
is null-homotopic, its integral along $\mu$ is zero (Corollary 3 of Theorem 3). As
for the function $g_a$, for $\zeta \neq a$, it is represented by a series


    $g_a(\zeta) = \sum_{n \le -2} c_n(\zeta - a)^n$


converging in $C - \{a\}$ and without any terms of degree -1; it, therefore,
admits a primitive


    $\sum_{n \le -2} c_n(\zeta - a)^{n+1}/(n + 1)$


on $C - \{a\}$ (Chap. VII, n° 16: possibility of differentiating a Laurent series
term by term). So irrespective of whether $\mu$ is null-homotopic or not, the
integral of $g_a$ along $\mu$ is zero.

The only terms in sum (2) contributing effectively to the computation of
the integral are, therefore, the fractions $\rho_a/(\zeta - a)$. Hence, in view of (3), the
relation


(5.4)    $\int_\mu f(\zeta)\,d\zeta = 2\pi i \sum_{a \in S} \mathrm{Ind}_\mu(a)\,\mathrm{Res}(f, a)$


follows. 

This result assumes that f has finitely many singular points in G. In fact,
it remains valid when S is a possibly infinite closed and discrete subset of
the topological space G or, equivalently, when every $z \in G$ has a neighbourhood V
such that $V \cap S$ is finite, or else when $K \cap S$ is finite for any compact³⁰
set $K \subset G$. To see this, let us first prove the following result which holds for
far more general spaces:


30 “closed in G” means that every point of G (and not of C) which is a limit point
of S is in S; “discrete in G” means that every $z \in S$ has a neighbourhood V such
that $V \cap S = \{z\}$. Supposing that $K \subset G$ is compact, if $K \cap S$ is infinite, then
there exists (use BL) $a \in K$ such that $V \cap S$ is infinite for every neighbourhood V
of a. Hence, we get a sequence of pairwise distinct points of S converging to a;
since S is closed in G, it follows that $a \in S$, which contradicts the discreteness
assumption on S.


Lemma. For any open subset G of C, there exist a sequence of compact
subsets $K_n$ and open subsets $G_n$ such that


    $G = \bigcup K_n$,   $K_n \subset G_n \subset K_{n+1}$.


Every compact set contained in G is then contained in some $K_n$.


For every z ∈ C, set d(z) = d(z, C − G). As C − G is closed, the relation
d(z) = 0 is equivalent to z ∈ C − G, and so G is the set of z ∈ C such that
d(z) > 0. Besides, it is obvious that

(*)    |d(z') - d(z'')| \le d(z', z'') = |z' - z''|

for all z′ and z″. So the function d is continuous. Then define K_n by

    z \in K_n \iff d(z) \ge 1/n \ \&\ |z| \le n.

The subsets K_n are closed and bounded, and hence compact; they form an
increasing sequence and G is obviously their union. Besides, every a ∈ K_n is
in the interior of K_{n+1} because, by (*),

    d(a, z) < 1/n - 1/(n+1) \implies d(z) \ge d(a) - d(a, z) > 1/(n+1),

while |z| < |a| + 1 ≤ n + 1, so that K_{n+1} contains a disc centered at a. The
set G_n consisting of the interior points of K_{n+1} is, therefore, suitable.
Finally, if K is a compact subset of G, it is covered by the open subsets G_n,
hence by a finite number of them. So K ⊂ K_n for large n, qed.

Having done this, let us return to formula (4) for a holomorphic func-
tion f on G − S and a closed path μ in G − S, null-homotopic in G. As the
path μ contracts to a point under a homotopy, this homotopy describes a compact
set K ⊂ G, contained in one of the open subsets G_n of the Lemma. As
G_n ∩ S = S_n is finite and as μ is null-homotopic in G_n, (4) applies to G_n
provided only the points a ∈ S_n are included in the sum. But as the result holds
for all sufficiently large n, passing to the limit is trivial (in fact, the sum
only has finitely many non-zero terms), qed. In conclusion:


Theorem 5 (Cauchy’s residue formula). Let G be a domain in C, S a
closed discrete subset of G and f a holomorphic function on G′ = G − S.
Then

(5.5)    \int_\mu f(\zeta)\, d\zeta = 2\pi i \sum_{a \in S} \mathrm{Ind}_\mu(a) \, \mathrm{Res}(f, a)

for any admissible closed path μ in G′ null-homotopic in G.
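
As a numerical sanity check of the residue formula, here is a minimal Python sketch. The test function f(z) = e^z/(z(z − 2)), the helper name circle_integral and the choice of circles are illustrative assumptions, not taken from the text; both circles are traversed once counterclockwise, so the index of every enclosed pole is 1.

```python
import cmath
import math

def circle_integral(f, center, radius, n=4096):
    """Trapezoidal approximation of the integral of f along the circle
    |z - center| = radius, traversed once counterclockwise."""
    total = 0j
    for k in range(n):
        e = cmath.exp(2j * math.pi * k / n)
        total += f(center + radius * e) * (1j * radius * e)  # f(z) * dz/dt
    return total * (2 * math.pi / n)

def f(z):
    # Illustrative function with simple poles at 0 and 2.
    return cmath.exp(z) / (z * (z - 2))

res_at_0 = -0.5              # e^0 / (0 - 2)
res_at_2 = math.exp(2) / 2   # e^2 / (2 - 0)

small = circle_integral(f, 0, 1)  # encloses only the pole at 0
large = circle_integral(f, 0, 3)  # encloses both poles

print(abs(small - 2j * math.pi * res_at_0))
print(abs(large - 2j * math.pi * (res_at_0 + res_at_2)))
```

Both printed differences should be at the level of rounding error, since the trapezoidal rule converges very fast for smooth periodic integrands.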


If G is simply connected, the formula can be applied to every closed path
in G − S. Hence, if the residues of f are all zero, then the integral of f along
any closed path in G′ is zero. As a result:


Corollary. A holomorphic function f on G − S, where G is simply con-
nected, has a primitive on G − S if and only if Res(f, a) = 0 for all a ∈ S.


The figure below is considered in the classical theory, where S = {a_1, ..., a_n}
is finite.


Fig. 5.7.


It contains the path μ and a path ν consisting, on the one hand, of
paths connecting in G − S an arbitrarily chosen point w to points w_p in the
neighbourhood of the a_p and, on the other, of circular paths μ_p centered at the a_p.
The path ν consists of a loop surrounding the point a_1, followed by a loop
surrounding the point a_2, and so on until the point a_n, the first loop consisting
of the path from w to w_1, followed by the path μ_1 from w_1 to w_1, followed in
turn by the first path in the reverse direction from w_1 to w, the other loops
being defined similarly. When f is integrated along ν, the contributions from
the paths connecting w to the w_p cancel out, leaving the sum of the integrals along
small circles surrounding the points a_p. But if f(ζ) = Σ c_n(ζ − a_p)^n is in-
tegrated along a sufficiently small circle μ_p centered at a_p, the formula for
the calculation of the coefficients, applied for n = −1, gives 2πi c_{−1}. The integral
of f along ν is, therefore, 2πi Σ Res(f, a_p). On the other hand, ν and the
initially given path μ are “obviously” homotopic in G′ as closed paths. Hence
the integrals of f along μ and ν are equal. This proves the residue formula.

Except for one detail: this argument leaves out the factors Ind_μ(a_p). To
obtain them, the above figure needs to be made more complicated when μ
circles the points a_p several times, in both possible directions.

This type of argument, heavily exploited by Riemann in his theory of
algebraic functions (in his day, he did not have any choice) and by his
many successors, has, nonetheless, supplied several useful formulas, as we will
have occasion to see later.


(ii) Cauchy’s integral formula: the general case. To generalize Cauchy’s
integral formula (4.10) for the circle, we apply Theorem 5 to the
function g(ζ) = f(ζ)/(ζ − w), where w ∈ G − S is given. It is holomorphic on
G − ({w} ∪ S), and it all amounts to calculating the residues. First, Res(g, w) =
f(w) clearly holds since f(ζ) = f(w) + f′(w)(ζ − w) + ... in the neighbourhood
of w (Taylor series). In the neighbourhood of a point a ∈ S,

    \frac{1}{\zeta - w} = \frac{-1}{(w - a) - (\zeta - a)} = -\sum_{m \in \mathbb{N}} (w - a)^{-m-1} (\zeta - a)^m

if |ζ − a| < |w − a|. Hence, if

(5.6)    f(\zeta) = \sum_{n \in \mathbb{Z}} c_n (\zeta - a)^n

in the neighbourhood of the singular point a, then the associativity theorem
for absolutely convergent series (Chap. II, n° 18) shows that

    g(\zeta) = -\sum_{n \in \mathbb{Z},\, m \ge 0} c_n (w - a)^{-m-1} (\zeta - a)^{m+n}.

The residue of g is obtained by grouping together the terms for which m + n =
−1 (same reference), and so

(5.7)    \mathrm{Res}(g, a) = -\sum_{m \in \mathbb{N}} c_{-m-1} (w - a)^{-m-1} = -\sum_{n < 0} c_n (w - a)^n.

This is the value up to sign at w of the polar part³¹ of the Laurent series of f
with respect to a.

This formula simplifies if f has a simple pole at a, i.e. if the polar part of its
Laurent series with respect to a reduces to its term of degree −1, leaving Res(g, a) =
−c_{−1}/(w − a) = −Res(f, a)/(w − a). Hence formula (5) applied to g gives
the following result:


Theorem 6. Let G be a domain in C, S a closed and discrete subset of G and
f a holomorphic function on G − S with simple poles at every point of S. If
w ∈ G − S and if μ is a closed path in G − (S ∪ {w}) null-homotopic in G, then

(5.8)    \frac{1}{2\pi i} \int_\mu \frac{f(\zeta)}{\zeta - w}\, d\zeta = \mathrm{Ind}_\mu(w) f(w) + \sum_{a \in S} \mathrm{Ind}_\mu(a) \frac{\mathrm{Res}(f, a)}{a - w}.
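
The formula of Theorem 6 can be checked numerically on a simple case. In the following Python sketch, the function f(z) = e^z/z (one simple pole at 0, with residue 1), the point w = 0.5 and the unit circle are illustrative assumptions; the circle has index 1 with respect to both w and the pole, so the right hand side reduces to f(w) + Res(f, 0)/(0 − w).

```python
import cmath
import math

def circle_integral(f, center, radius, n=4096):
    """Trapezoidal approximation of the integral of f along the circle
    |z - center| = radius, traversed once counterclockwise."""
    total = 0j
    for k in range(n):
        e = cmath.exp(2j * math.pi * k / n)
        total += f(center + radius * e) * (1j * radius * e)  # f(z) * dz/dt
    return total * (2 * math.pi / n)

def f(z):
    # Illustrative function: simple pole at 0 with residue 1.
    return cmath.exp(z) / z

w = 0.5
lhs = circle_integral(lambda z: f(z) / (z - w), 0, 1) / (2j * math.pi)
rhs = f(w) + 1.0 / (0 - w)   # Ind(w) f(w) + Ind(0) Res(f, 0) / (0 - w)

print(abs(lhs - rhs))
```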


31 It was shown above that this polar part converges in C − {a}.

This type of argument has several versions.


Theorem 6bis. Let G be a simply connected domain, f a holomorphic
function on G and μ a closed admissible path in G. Then

(5.9)    \frac{1}{2\pi i} \int_\mu \frac{f(\zeta)}{(\zeta - w)^{n+1}}\, d\zeta = \mathrm{Ind}_\mu(w) \, f^{(n)}(w)/n!

for all w ∈ G − Supp(μ) and for all n ∈ N.


This result is the residue theorem applied to the function f(ζ)/(ζ − w)^{n+1};
indeed, it has at most one singular point in G: a pole at z = w, with residue
f^{(n)}(w)/n!, since Taylor’s formula

    f(z)/(z - w)^{n+1} = (z - w)^{-n-1} \sum_p (z - w)^p f^{(p)}(w)/p!

shows that the coefficient of 1/(z − w) is equal to f^{(n)}(w)/n!.
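
Formula (5.9) gives a way of computing derivatives by integration, which the following Python sketch illustrates. The helper name cauchy_derivative, the unit circle and the sample point w are illustrative assumptions; the circle has index 1 with respect to w.

```python
import cmath
import math

def circle_integral(f, center, radius, n=4096):
    """Trapezoidal approximation of the integral of f along the circle
    |z - center| = radius, traversed once counterclockwise."""
    total = 0j
    for k in range(n):
        e = cmath.exp(2j * math.pi * k / n)
        total += f(center + radius * e) * (1j * radius * e)  # f(z) * dz/dt
    return total * (2 * math.pi / n)

def cauchy_derivative(f, w, n):
    """Approximate the n-th derivative of f at w, for |w| < 1, from
    n!/(2 pi i) times the integral of f(z)/(z - w)^(n+1) over |z| = 1."""
    integral = circle_integral(lambda z: f(z) / (z - w) ** (n + 1), 0, 1)
    return math.factorial(n) * integral / (2j * math.pi)

w = 0.3 + 0.2j
print(abs(cauchy_derivative(cmath.exp, w, 3) - cmath.exp(w)))   # (exp)''' = exp
print(abs(cauchy_derivative(cmath.sin, w, 3) + cmath.cos(w)))   # (sin)''' = -cos
```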


(iii) The number of zeros and poles of a function. The residue formula has
several other immediate consequences. Consider for example a meromorphic
function f on a domain G; this means that there exists a discrete and closed
subset S of G such that f is holomorphic and without any zeros on G − S and
such that the points of S are zeros or poles, but not essential singularities, of f. The
function f′(z)/f(z) is then holomorphic on G − S and its only singularities
are the points a ∈ S. In the neighbourhood of any point a ∈ G, there is a
series expansion

(5.10)    f(z) = c_p (z - a)^p + c_{p+1} (z - a)^{p+1} + \ldots

with c_p ≠ 0; the integer p is ≥ 0 if f is holomorphic at a; it is > 0 if f(a) = 0,
and it is < 0 if a is a pole of f. Denote it by v_a(f), so that if S is finite, then

    f(z) = u(z) \prod_{a \in S} (z - a)^{v_a(f)},

where the function u is holomorphic everywhere and ≠ 0 everywhere on G,
and so has an inverse in the ring of meromorphic functions on G. This is
the analogue of the decomposition of an integer into primes.³² Having said
this, let us apply the residue formula to the function f′(z)/f(z), holomorphic
outside S. By (10),

    f(z) = (z - a)^p g(z),

where g is holomorphic and non-zero at a, and so

    f'(z)/f(z) = p/(z - a) + g'(z)/g(z).


32 For the extension of divisibility theory to meromorphic functions, see Remmert,
Funktionentheorie 2, Chap. 3 and 4, where it is in particular shown that ev-
ery meromorphic function on a domain G is the quotient of two holomorphic
functions on G having no common zeros.


The function g′/g being holomorphic at a, Res(f′/f, a) = p = v_a(f) and the
residue formula then shows that

(5.11)    \int_\mu \frac{f'(z)}{f(z)}\, dz = 2\pi i \sum \mathrm{Ind}_\mu(a) \, v_a(f) = 2\pi i \, v_\mu(f),

where the sum, extended to all a ∈ G, only has a finite number of non-zero
terms. If for example f is holomorphic everywhere on G and if μ is a simple
closed curve whose interior is contained in G and on which f does not vanish,
then integral (11) allows us to calculate the total number of zeros of f in the
interior of μ; this number takes into account the order or multiplicity v_a(f)
of each zero.
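
Formula (5.11) can be turned into a zero-counting procedure, sketched below in Python. The helper name count_zeros and the sample functions are illustrative assumptions; the derivative is supplied by hand, and the circle must avoid the zeros of f.

```python
import cmath
import math

def circle_integral(f, center, radius, n=4096):
    """Trapezoidal approximation of the integral of f along the circle
    |z - center| = radius, traversed once counterclockwise."""
    total = 0j
    for k in range(n):
        e = cmath.exp(2j * math.pi * k / n)
        total += f(center + radius * e) * (1j * radius * e)  # f(z) * dz/dt
    return total * (2 * math.pi / n)

def count_zeros(f, df, center, radius):
    """Number of zeros of a holomorphic f inside the circle, counted with
    multiplicity, as 1/(2 pi i) times the integral of f'/f."""
    value = circle_integral(lambda z: df(z) / f(z), center, radius)
    return round((value / (2j * math.pi)).real)

# sin vanishes at 0, pi and -pi inside |z| = 4, and only at 0 inside |z| = 1.
print(count_zeros(cmath.sin, cmath.cos, 0, 4))
print(count_zeros(cmath.sin, cmath.cos, 0, 1))

# z^2 (z - 3) has a double zero at 0, counted twice inside |z| = 1.
print(count_zeros(lambda z: z * z * (z - 3), lambda z: 3 * z * z - 6 * z, 0, 1))
```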

As a function of f, the left hand side of (11) satisfies a remarkable conti-
nuity property resulting from that of the map f ↦ f^{(n)} studied above. The
open set G and the path μ being fixed, first note that if f does not vanish
on Supp(μ), then neither do functions g defined on G and sufficiently near
f in the sense of compact convergence. Then choose a number r > 0 strictly
less than the distance from Supp(μ) to the boundary of G and let K ⊂ G be
the compact set of points whose distance to Supp(μ) is ≤ r. Since f does not
vanish on Supp(μ), for sufficiently small r, it does not vanish on K either.
The lower bound d of |f(z)| in K is, therefore, > 0 if r is sufficiently small,
and that of |g(z)| is > d/2 if ‖f − g‖_K < d/2. If g is holomorphic, the results
stated in n° 4, (iv) show that ‖f′ − g′‖_μ ≤ Md, where M does not depend
on g. If ‖f − g‖_K is sufficiently small, the integral

    2\pi i \, v_\mu(g) - 2\pi i \, v_\mu(f) = \int_\mu \left[ \frac{g'(z)}{g(z)} - \frac{f'(z)}{f(z)} \right] dz

is well defined and, considering an upper bound for it,³³ it follows that for
‖f − g‖_K sufficiently small, |v_μ(g) − v_μ(f)| < 1, and so = 0. As a result, the
number v_μ(f) of roots of a holomorphic function f in the interior of a closed
path μ is a continuous function of f with respect to the topology of compact
convergence. In particular, for every function f holomorphic on G and all
a ∈ G, there exist r > 0, ρ > 0 and a compact set K ⊂ G such that the
number of zeros in the disc |z − a| < r of any function g holomorphic on G
and satisfying ‖g − f‖_K < ρ is the same as that of f.


33 It all amounts to finding an upper bound for an expression of the form |f/g − p/q|
when lower bounds > 0 of |g| and |q| and upper bounds for |f|, |p|, |f − p| and |g − q|
are known. See the rules of Chapter III, § 2, n° 7 concerning algebraic operations
on uniformly convergent sequences.


If f and g are replaced by f − c and f − c′, where c and c′ are constants,
so that ‖g − f‖_K = |c − c′|, then, for given c, the equations f(z) = c and
f(z) = c′ have the same number of solutions in the interior of μ, provided
|c′ − c| is sufficiently small. A direct argument: if f(z) − c does not vanish on
Supp(μ), neither does f(z) − c′ for c′ sufficiently near c; when c′ approaches c,
f′/(f − c′) obviously converges uniformly to f′/(f − c) on Supp(μ), implying
the result. This also means that the number v_μ(f − c) of solutions of f(z) = c
in the interior of μ only depends on the connected component of C − f(Supp(μ))
containing c.

If for example f has a zero of order p at a and does not vanish elsewhere
on a disc |z − a| < r, then, for any sufficiently small c ∈ C, the equation
f(z) = c has p solutions in the disc. This shows that the image under f of a
disc centered at a contains a disc centered at f(a) and, as a is arbitrary, that
f maps every open subset of G to an open subset of C. Moreover, the roots
of f(z) = c ≠ 0 in |z − a| < r are pairwise distinct for sufficiently small r
since, even if f′(a) = 0, f′(z) ≠ 0 for 0 < |z − a| < r if r is sufficiently small,
which prevents f(z) = c from having multiple roots in this disc.

For p = 1, this argument shows that f is injective in the neighbourhood
of a if and only if f′(a) ≠ 0. If f is globally injective on G, then f′(z) ≠ 0
everywhere and, as shown above (and in Chapter III, § 5, n° 24, by using
the real-variable version of the local inversion theorem), f is a conformal
representation on an open subset, as remarked earlier. In conclusion:


Theorem 7. Any holomorphic and non-constant function f on G maps open 
subsets of G to open subsets of C. It is a conformal representation of G 
on f(G) if and only if f is injective on G. 


(iv) Residues at infinity. The simplest functions the residue formula can
be applied to are rational functions f(z) = p(z)/q(z), where p and q are
polynomials without common roots. As will be seen later, using this method,
the integral over R of any function of this type can be computed, at least when
it converges. But an important theoretical result about residues of rational
functions can now be proved.

Indeed, integrate f over a circle |z| = R going around it once counter-
clockwise. If R is sufficiently large, we get, up to a factor of 2πi, the sum
of the residues of f at all its poles, which are the roots of q. Now, for large
z, there is an asymptotic estimate of the form f(z) ~ cz^n, with c ≠ 0 and
n = d°(p) − d°(q), and in particular |f(z)| ≤ M|z|^n, where M is a constant.
The integral over the circle, equal to

(5.12)    2\pi i \int_0^1 f[R e(t)] \, R e(t)\, dt,

where e(t) = exp(2πit), is, therefore, O(R^{n+1}). So it approaches 0 if n ≤ −2,
i.e. if

(5.13)    d^\circ(q) \ge d^\circ(p) + 2.

Hence assumption (13) implies the relation

(5.14)    \sum \mathrm{Res}(f, a) = 0.

Otherwise, the argument falls apart and in fact (14) no longer
holds. As seen in n° 4, (vi), on the exterior of a disc of very large radius R,
the function f has a Laurent series expansion Σ c_p z^p with

    2\pi i \, c_p = \int_{|\zeta| = R} f(\zeta) \, \zeta^{-p-1}\, d\zeta.

For p = −1, we recover the integral of f along the circle, which is therefore
equal to 2πi c_{−1}. In the general case, (14) needs to be replaced by the relation

(5.15)    \sum \mathrm{Res}(f, a) = c_{-1}.

If in particular f(z) ~ c/z at infinity (case n = −1), then

(5.16)    \sum_{a \in \mathbb{C}} \mathrm{Res}(f, a) = c = \lim z f(z).

Setting (without forgetting the sign!)

    \mathrm{Res}(f, \infty) = -c_{-1}

to be the residue of f at infinity, instead of (14), we get

(5.14')    \sum_{a \in \mathbb{C} \cup \{\infty\}} \mathrm{Res}(f, a) = 0.

This tautology is justified by its extension to much more general situations
(compact Riemann surfaces) where it is far less obvious.
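
The three cases just discussed can be observed numerically. In the Python sketch below, the three rational functions and the radius 10 are illustrative assumptions; each function has its poles at ±i, and the sum of its residues is read off from the integral over a large circle.

```python
import cmath
import math

def circle_integral(f, center, radius, n=4096):
    """Trapezoidal approximation of the integral of f along the circle
    |z - center| = radius, traversed once counterclockwise."""
    total = 0j
    for k in range(n):
        e = cmath.exp(2j * math.pi * k / n)
        total += f(center + radius * e) * (1j * radius * e)  # f(z) * dz/dt
    return total * (2 * math.pi / n)

def residue_sum(f, radius=10.0):
    """Sum of the residues of f at its poles inside |z| = radius."""
    return circle_integral(f, 0, radius) / (2j * math.pi)

s1 = residue_sum(lambda z: 1 / (z * z + 1))       # degree condition (13): sum 0
s2 = residue_sum(lambda z: z / (z * z + 1))       # f ~ 1/z: sum = lim z f(z) = 1
s3 = residue_sum(lambda z: z ** 3 / (z * z + 1))  # f = z - 1/z + ...: c_{-1} = -1

print(s1, s2, s3)
```

In each case the residue at infinity is the opposite of the computed sum, in agreement with the convention Res(f, ∞) = −c_{−1}.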

This residue at infinity can be defined for any function f, rational
or otherwise, defined and holomorphic for large |z|. First of all, what will
be meant by the behaviour of f “in the neighbourhood of infinity” needs
to be specified. Such a function, whether rational or not, has a Laurent se-
ries expansion f(z) = Σ c_n z^n which converges for large |z|, with possibly
infinitely many terms of negative or positive degree. It is then natural to say
that f is holomorphic at infinity or on a neighbourhood of infinity if c_n = 0
for all n > 0 (in other words, if f(z) = O(1) for large |z|) and to set, by
definition,

(5.17)    f(\infty) = c_0 = \lim f(z).

If f(∞) = 0, f will be said to have a zero of order p at infinity if

(5.18')    f(z) = \ldots + c_{-p-1} z^{-p-1} + c_{-p} z^{-p}  with  c_{-p} \ne 0,

in other words if f(z) ≍ 1/z^p for large z.

If f is not holomorphic at infinity, f will be said to have a pole of order p
at infinity if

(5.18'')    f(z) = \ldots + c_{p-1} z^{p-1} + c_p z^p  with  c_p \ne 0,

in other words, if f(z) ≍ z^p for large z. Otherwise, the expression essen-
tial singular point at infinity is used, for example in the case of e^z. In all
cases, the Laurent expansion of f leads to setting

    \mathrm{Res}(f, \infty) = -c_{-1}

as above, whence

(5.19)    \int_{|z| = R} f(z)\, dz = -2\pi i \, \mathrm{Res}(f, \infty)  for large R

follows, attention needing to be paid to the sign on the right hand side.

A rational function only has polar singularities in Ĉ = C ∪ {∞}, in other
words is meromorphic on Ĉ, and no other function satisfies this property.
First of all, such a function f can only have finitely many poles because,
even if it has a pole at infinity, it is holomorphic outside a compact disc
D, and it can only have finitely many poles in D. Multiplying f by a
polynomial chosen so as to remove the poles of f in D, we get a function
whose only possible singularity in Ĉ is a pole at infinity; it is, therefore, an
entire function on C whose order of magnitude at infinity is that of a power
of z, and hence, by Liouville’s theorem (Chapter VII, § 4, n° 18, theorem 15),
is a polynomial. The result follows.


(v) The conformal invariance of a residue. At first sight, the definition
of the residue of f at infinity seems strange; apart from the sign chosen, in
the series expansion of f(z), the residue is the coefficient of a power of z
approaching 0 as z approaches infinity, whereas the exact opposite holds for
residues at points ≠ ∞. This requires some explanation, found by replacing
the variable z by 1/z.

Indeed, computing à la Leibniz, the change of variable z = 1/ζ transforms
the expression³⁴ ω = f(z)dz into ϖ = f(1/ζ)d(1/ζ) = g(ζ)dζ, where

    g(\zeta) = -f(1/\zeta) \, \zeta^{-2} = -(\ldots + c_{-1} \zeta + \ldots) \, \zeta^{-2} = \ldots - c_{-1}/\zeta + \ldots.

Therefore,

    \mathrm{Res}(f, \infty) = -c_{-1} = \mathrm{Res}(g, 0).


34 It is some sort of differential in the sense of Chapter IX; the transformation it is
made to undergo here consists in computing its “inverse image” under z ↦ 1/z.


This suggests that the residue of a function f at a point a in fact involves
the differential form ω = f(z)dz rather than the function f itself; it would
therefore be better to write it Res(ω, a). To justify this, let us generalize the
situation by considering a holomorphic function f on U − S, where S is closed
and discrete in an open subset U of C, and let us investigate what happens
to ω = f(z)dz under a conformal representation z ↦ φ(z) = ζ of U on an
open subset V of C. If the inverse of φ is the map ψ : V → U, a formal
calculation shows that ω is transformed into

    \varpi = f[\psi(\zeta)] \, \psi'(\zeta)\, d\zeta.

The function f[ψ(ζ)]ψ′(ζ) is holomorphic on V − φ(S), and the more general
result we have in mind is the formula

(5.20)    \mathrm{Res}(\omega, a) = \mathrm{Res}[\varpi, \varphi(a)],

which holds for all a ∈ S and expresses the invariance of the residue of
a holomorphic differential form under a conformal representation. On the
other hand, the residues at a and b = φ(a) of the functions f(z) and f[ψ(ζ)]
are obviously not equal: for f(z) = 1/z, a = 0, φ(z) = 2z, ψ(ζ) = ζ/2,
f[ψ(ζ)] = 2/ζ ≠ 1/ζ, but ω = dz/z and ϖ = dζ/ζ.

To prove (20), we may assume that a = φ(a) = 0 and that U is an open
disc centered at 0 containing no singular points of f other than 0. Then
(use the Laurent series) there is a holomorphic function F on U − {a} such
that f(z) = F′(z) + c/z, where c = Res(f, a). Then,

    \omega = F'(z)\, dz + c\, dz/z = dF + c\, dz/z,

and so

    \varpi = F'[\psi(\zeta)] \, \psi'(\zeta)\, d\zeta + c \, \psi'(\zeta)\, d\zeta / \psi(\zeta).

As F′[ψ(ζ)]ψ′(ζ) is the derivative of F[ψ(ζ)], the contribution of the first
term to the residue of ϖ at b = 0 is zero; Res(ϖ, b) is, therefore, up to a
factor c, the coefficient of 1/ζ in the Laurent series of

    \psi'(\zeta)/\psi(\zeta) = [\psi'(0) + \ldots] / [\psi'(0) \zeta + \ldots].

As φ and ψ are mutually inverse, ψ′(0) ≠ 0, and so Res(ψ′/ψ, b) = 1 and
Res(ϖ, b) = c = Res(ω, a), which proves (20).
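
The difference between the residue of a differential form and that of a function can be seen numerically. The Python sketch below assumes, purely for illustration, the function f(z) = 2/z + 3/z² and the map φ(z) = e^z − 1 near 0, whose inverse is ψ(ζ) = log(1 + ζ) on the disc |ζ| < 1; none of these choices comes from the text.

```python
import cmath
import math

def circle_integral(f, center, radius, n=4096):
    """Trapezoidal approximation of the integral of f along the circle
    |z - center| = radius, traversed once counterclockwise."""
    total = 0j
    for k in range(n):
        e = cmath.exp(2j * math.pi * k / n)
        total += f(center + radius * e) * (1j * radius * e)  # f(z) * dz/dt
    return total * (2 * math.pi / n)

def f(z):
    return 2 / z + 3 / (z * z)    # residue of f(z)dz at 0 is 2

def psi(zeta):
    return cmath.log(1 + zeta)    # inverse of phi(z) = exp(z) - 1 near 0

def dpsi(zeta):
    return 1 / (1 + zeta)         # derivative of psi

# Residue at 0 of the transformed form f[psi] psi' d(zeta): invariant.
res_form = circle_integral(lambda s: f(psi(s)) * dpsi(s), 0, 0.5) / (2j * math.pi)
# Residue at 0 of the mere composed function f[psi]: not invariant.
res_func = circle_integral(lambda s: f(psi(s)), 0, 0.5) / (2j * math.pi)

print(res_form, res_func)
```

The first value should be close to 2, as (5.20) predicts, whereas expanding 1/ψ and 1/ψ² in powers of ζ shows that the second is 2 + 3 = 5.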

If a conformal representation leaves the residues of a holomorphic differ-
ential form ω = f(z)dz invariant, it may be expected that the integrals of ω
are also left invariant. To see this, consider a path μ in U − S and its image
t ↦ ν(t) = φ[μ(t)] under φ; to compute the integral

(5.21)    \int_\nu \varpi = \int_\nu f[\psi(\zeta)] \, \psi'(\zeta)\, d\zeta,

by definition, ζ needs to be replaced by ν(t) and dζ by ν′(t)dt; ψ(ζ) is thereby
replaced by ψ{φ[μ(t)]} = μ(t) since ψ and φ are mutually inverse, f[ψ(ζ)] by
f[μ(t)], and ψ′(ζ)dζ by ψ′[ν(t)]ν′(t)dt; but ψ being holomorphic, the
definitions obviously imply what we know, namely that

    \psi'[\nu(t)] \, \nu'(t) = \frac{d}{dt} \psi[\nu(t)] = \mu'(t),

so finally

(5.22)    \int_\nu \varpi = \int f[\mu(t)] \, \mu'(t)\, dt = \int_\mu \omega,

as expected.

This result is independent of residue theory, but suppose that μ is a
closed path in U − S null-homotopic in U. In this case, ν is a closed path in
V − φ(S), clearly null-homotopic in V. Then the residue formula (Theorem
5) shows that

    \sum \mathrm{Ind}_\nu(b) \, \mathrm{Res}(\varpi, b) = \sum \mathrm{Ind}_\mu(a) \, \mathrm{Res}(\omega, a)

follows from (22). This result can be applied to the case S = {a}, φ(a) = b,
where a is any point in U, and where f is a holomorphic function on U − {a},
for example 1/(z − a). The residues being equal, we conclude that

(5.23)    \mathrm{Ind}_\nu[\varphi(a)] = \mathrm{Ind}_\mu(a).

This may seem obvious geometrically but is not so, especially as φ could a
priori transform a counterclockwise path μ around a into a clockwise
path ν around b = φ(a).

Hence a conformal representation φ leaves the “rotation direction around
a point” of a closed path invariant; as will be seen in the next chapter, in a far
more general situation, this is due to the fact that the Jacobian of φ = p + iq,
regarded as a map from R² to R², namely

    J_\varphi(z) = D_1 p \cdot D_2 q - D_2 p \cdot D_1 q = |\varphi'(z)|^2,

is positive.
(vi) Functions on the Riemann sphere. A new set

    \hat{\mathbb{C}} = \mathbb{C} \cup \{\infty\},

obtained by adding to C an element written ∞, whose choice and nature
matter little, was introduced above (reread Hardy in Chapter II, end of n° 2).
Having done this, define a topology on Ĉ by setting U ⊂ Ĉ to be open if U ∩ C
is open in C in the usual sense and if the exterior of a disc is contained in U
when ∞ ∈ U. Open subsets containing the point ∞ are, therefore, precisely
the complements in Ĉ of the compact subsets of C. Axioms about unions and
intersections of open sets are immediately seen to be satisfied. This topology
of Ĉ allows us to define the notions of limit and continuity; for example,
saying that a sequence of points z_n ∈ C approaches ∞ with respect to the
topology of Ĉ means that |z_n| increases indefinitely, since it is required
that, for any compact set K ⊂ C, z_n ∈ C − K for large n. With respect to this
topology, Ĉ is compact. If indeed Ĉ is the union of a family of open subsets
U_i, one of them, say U_j, contains the point at infinity, and hence also the
exterior of a compact set K ⊂ C; as K can be covered (BL) by finitely many
sets C ∩ U_i, these U_i, together with U_j, form a finite cover of Ĉ, qed. BW can
also be checked.

To understand the topology of Ĉ “geometrically”, consider the classical
unit sphere S² in R³ = C × R and, denoting its north pole by ν = (0, 0, 1),
consider the map p associating to any ζ ∈ S² other than ν the point z ∈ C =
R² where the line passing through ν and ζ meets the equatorial plane C; this
is the “stereographic projection” from the north pole, used to map regions
not too close to it. We thus get a homeomorphism from S² − {ν} onto C
transforming the exterior of a disc of radius R centered at 0 in C into the set
of points ζ ∈ S² − {ν} whose third coordinate satisfies a relation a < ζ₃ < 1.
Therefore, if we generalize the definition of p by setting p(ν) = ∞, we obtain
a continuous bijection from S² onto Ĉ, hence a homeomorphism since the
sphere is compact. Ĉ is generally called the Riemann sphere; I do not know
whether he would have appreciated this tribute: it is like congratulating an
Olympic cycling champion for having won the amateur criterium of his home
town.
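
For readers who like explicit formulas, the stereographic projection and its inverse can be written down as follows. This Python sketch is only an illustration; the function names project and lift are assumptions, and the formulas are the standard ones obtained by intersecting the line through the north pole with the plane ζ₃ = 0.

```python
def project(x1, x2, x3):
    """Stereographic projection from the north pole (0, 0, 1):
    maps a point of the unit sphere other than the pole to a point of C."""
    return complex(x1, x2) / (1 - x3)

def lift(z):
    """Inverse map: the point of the unit sphere whose projection is z."""
    z = complex(z)
    s = abs(z) ** 2
    return (2 * z.real / (s + 1), 2 * z.imag / (s + 1), (s - 1) / (s + 1))

point = lift(3 + 4j)
print(point)             # a point of the unit sphere
print(project(*point))   # back to 3 + 4j, up to rounding

# The larger |z| is, the closer the third coordinate is to 1: the exterior
# of a disc |z| > R corresponds to a cap (R^2 - 1)/(R^2 + 1) < x3 < 1.
print(lift(10)[2], lift(1000)[2])
```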

Ĉ is indeed the only trivial example of a compact “Riemann surface” or,
in today’s terminology, of a “compact complex analytic manifold of dimen-
sion 1” (Chap. X). A reasonable definition of the notion of a holomorphic
function on an open subset U of S² follows by stipulating that such a func-
tion should depend holomorphically, possibly also at infinity, on the point
p(ζ) ∈ p(U), where p : S² → Ĉ is the stereographic projection. Without
resorting to a cartography unlikely to yield useful generalizations in this type
of context,³⁵ a function f defined on an open subset U of Ĉ with values in
C is said to be holomorphic if it is so in the usual sense when U ⊂ C, and, in
case ∞ ∈ U, if it is so in the usual sense on U ∩ C and approaches a finite
limit f(∞) at infinity; as the value f(∞) is defined by (17), this amounts to
saying that f(z) = g(1/z), where g is holomorphic in the neighbourhood of 0.
Besides, classical definitions together with those given in (iv) for behaviour
at ∞ make it possible to give meaning to the notion of a “pole” of a func-
tion when it is holomorphic in the neighbourhood of a point a ∈ Ĉ, except
at a itself. In particular, any rational function f(z) can be interpreted on Ĉ
as a function whose only singularities are poles, of which there are
necessarily finitely many since Ĉ is compact; and we have seen that rational
functions are characterized by this property: rational functions are identical
to meromorphic functions on Ĉ.


35 The construction of Ĉ can be generalized to any locally compact space X: the
open subsets of X̂ = X ∪ {∞} containing the point ∞ are set to be the comple-
ments of the compact subsets of X. Thus X becomes the complement of a point
in the compact space X̂, the Alexandrov one-point compactification of X. For
X = R, the space obtained is homeomorphic to the unit circle T. This can be
seen by using the map t ↦ (t − i)/(t + i) from R to T − {1}; it can be extended
by continuity to R̂ if a value of 1 is set for t = ∞, and as it is then bijective
and continuous, it is necessarily a homeomorphism. This construction transforms
functions approaching a limit at infinity into functions on X̂ continuous at ∞.
This is low level, but sometimes useful, general topology.

To give the reader a less trivial example connected to the theory of elliptic
functions, we choose a lattice L in C (Chapter II, § 3, n° 23) and we consider
the set C/L of equivalence classes mod L, obtained by identifying two
numbers z′, z″ ∈ C such that z′ − z″ ∈ L. Writing p for the map C → C/L
associating to each z ∈ C its class mod L, a topology can be defined on C/L
by setting a subset U ⊂ C/L to be open if and only if p⁻¹(U) is open in C:
to ensure the continuity of p, we confine ourselves to the minimum required.
The space C/L is compact³⁶ since, if we choose a compact subset K in C
meeting all the classes mod L (for example a closed parallelogram generated
by two basis vectors³⁷ for L), then p(K) = C/L. As p is continuous, since BW
or BL holds for K it holds for C/L (see Theorem 11 of Chapter III, § 3, n° 9,
whose proof generalizes immediately). Having said this, a function f defined
on an open set U ⊂ C/L is, by definition, holomorphic if and only if the function
z ↦ f[p(z)], defined on the open subset p⁻¹(U) of C, is holomorphic in the
usual sense (and doubly periodic since it is constant on the classes mod L). In
this case, it would be easy to explain what is meant by a pole of a function
defined in the neighbourhood of a point of C/L, but not at the point itself;
transposing what we did for Ĉ would be sufficient. In particular, a
meromorphic function f on C/L only has finitely many poles since C/L is
compact; composing it with p, we get a doubly periodic and meromorphic
function on C: as we will see in Chap. XII, this is precisely what is called
an elliptic function of the lattice L, and we will show that, in this case, the
meromorphic functions on C/L are just the rational functions in ℘_L(z) and
℘′_L(z), where ℘_L is the Weierstrass function of L (Chapter II, § 3, n° 23).

Hence all this is only a matter of translation, but this point of view has
proved itself exceedingly fruitful in much more general cases, in most of which
the construction is not at all as simple as in the last two examples, starting
with the case of algebraic functions of one variable studied by Riemann, i.e.
given, like the “function” ζ = z^{1/3}, by an equation P(z, ζ) = 0, where P is
a polynomial (see Chap. X).


36 It is also necessary to show that C/L satisfies the Hausdorff axiom: that two
distinct points have disjoint neighbourhoods; this is equivalent to saying that if
a, a′ ∈ C do not belong to the same class mod L, then there are discs D and D′
centered respectively at a and a′ such that (D + L) ∩ (D′ + L) = ∅, which
is immediate. I have not mentioned this axiom in the Appendix to Chap. III in
order not to steer the reader towards situations rarely encountered in everyday
mathematics, but it is nonetheless fundamental. It obviously holds in all metric
spaces.

37 Topologically, C/L can, therefore, be obtained by taking a period-parallelogram
P and by “gluing” its parallel sides pairwise; by gluing two parallel sides, we
get a cylinder and by gluing the end circles, a ring. In other words, C/L is
homeomorphic to the surface of a torus in R³.


6 — Dixon’s Theorem 


One may wonder in what cases Cauchy’s integral formula continues to hold
when μ is not null-homotopic in G. Though the proof³⁸ forms a double-sided
page of ingenious but perfectly elementary arguments, it was not until 1971
that the answer was known without recourse to heuristic arguments. To state it, we
need to again use notions defined at the end of n° 4, (i), namely the interior
and exterior of a closed path μ. Hence

    \mathbb{C} = \mathrm{Ext}(\mu) \cup \mathrm{Int}(\mu) \cup \mathrm{Supp}(\mu),

these sets being pairwise disjoint and, except the first one, bounded.


Theorem 8 (Dixon). Let G be a domain and μ a closed path in G. The
following statements are equivalent:

(i) The integral along μ of any holomorphic function on G is zero.
(ii) The interior of μ is contained in G.
(iii) For any holomorphic function f on G and any a ∈ G − Supp(μ),

(6.1)    \frac{1}{2\pi i} \int_\mu \frac{f(z)}{z - a}\, dz = \mathrm{Ind}_\mu(a) f(a).

The proof consists in showing easy logical implications except for the
second one. We will provide more details for it than its inventor.

(i) ⟹ (ii). For any a ∉ G, consider the function f(z) = 1/(z − a); by
(i), its integral along μ is zero, but, up to a factor 2πi, it is also the index of
a with respect to μ; the latter is, therefore, zero, proving (ii).

(ii) ⟹ (iii). This is Dixon’s contribution. Define a function g on G × G
by setting

(6.2)    g(\zeta, z) = [f(\zeta) - f(z)] / (\zeta - z)  if  \zeta \ne z,
         g(\zeta, z) = f'(z)  if  \zeta = z.

By definition of the index, (1) is equivalent to

(6.1')    \int_\mu g(\zeta, a)\, d\zeta = 0.
To prove (1′) for a given path μ, we proceed step-by-step by showing that

(a) g(ζ, z) is a continuous function of (ζ, z) ∈ G × G,
(b) h(z) = ∫_μ g(ζ, z)dζ is a holomorphic function on G,
(c) h can be analytically extended to all of C,
(d) the entire function thus obtained approaches 0 at infinity.

Liouville’s theorem (Chap. VII, § 4, n° 18) will then show that h(z) = 0
everywhere, proving (1′) and the implication (ii) ⟹ (iii).


38 J.D. Dixon, A brief proof of Cauchy’s integral theorem (Proc. Amer. Math. Soc.,
29, 1971, pp. 625-626), reproduced in Remmert, Funktionentheorie 1, Chap. 9,
§ 5, which I follow except for a few details.


(a) g is clearly a continuous function of the couple (ζ, z) on the subset of
G × G defined by the relation ζ ≠ z. Continuity at each point (a, a) of the
“diagonal” of the Cartesian product G × G is less obvious.

To simplify the notation, suppose that a = 0. The Taylor series f(z) =
Σ c_n z^n of f at a = 0 converges and represents f on a disc of radius R > 0.
Let D be a disc |z| < r with r < R; hence

    f(\zeta) - f(z) = \sum c_n (\zeta^n - z^n) = (\zeta - z) \sum_{n \ge 1} c_n (\zeta^{n-1} + \zeta^{n-2} z + \ldots + z^{n-1})

in D × D. As g(0, 0) = f′(0) = c_1, it follows that

    |g(\zeta, z) - g(0, 0)| = \left| \sum_{n \ge 2} c_n (\zeta^{n-1} + \zeta^{n-2} z + \ldots + z^{n-1}) \right| \le \sum_{n \ge 2} n |c_n| r^{n-1}.

This is a convergent power series in r, without a constant term. As r approaches 0,
so does its sum, proving the continuity of g.
(b) It is then possible to define

(6.3)    h(z) = \int_\mu g(\zeta, z)\, d\zeta = \int_I g[\mu(t), z] \, \mu'(t)\, dt,

where integration is along the given path μ : I → G. As g[μ(t), z] is a
continuous function of (t, z) on I × G by (a) and, for given t, is holomorphic
in z, the result is holomorphic in z by Theorem 9 stated below.

(c) Ext(w) contains the exterior of a disc, its complement 


K = Int() U Supp(p) 


is compact, and by assumption (ii), contained in G. Since G is open, and 
hence distinct from K, the open set U = GM Ext() is not empty. So for 
z€U, z ¢ Supp(), and writing 


(6.4) @= fz © a Fle ) [ = 


=f? 19) — rif (z) Ind, (z) 


II 
ss 
bs 
IA 
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Fig. 6.8. 


is justified since $\mathrm{Ind}_\mu(z) = 0$. However, the last integral obtained is well defined for all $z \notin \mathrm{Supp}(\mu)$, in particular on $\mathrm{Ext}(\mu)$, and hence is obviously a holomorphic function of $z$. As it coincides with $h(z)$ on the non-empty open set $G \cap \mathrm{Ext}(\mu)$, we get a holomorphic function on the open set $G \cup \mathrm{Ext}(\mu)$ by requiring it to be equal to $h$ on $G$ and to the integral in question on $\mathrm{Ext}(\mu)$. But as $\mathbb{C} - \mathrm{Ext}(\mu)$ is contained in $G$, $G \cup \mathrm{Ext}(\mu) = \mathbb{C}$. The new function $h$ is, therefore, defined and holomorphic on all of $\mathbb{C}$.

(d) The reason why this entire function approaches 0 at infinity is obvious: if the compact subset $\mathrm{Supp}(\mu)$ is contained in the disc $|\zeta| < R$, with $R$ finite, and if $z$ is exterior to it, then $|\zeta - z| > |z| - R$ for all $\zeta \in \mathrm{Supp}(\mu)$, and so

$|h(z)| \le m(\mu)/(|z| - R)$,

which gives the result. Liouville’s Theorem then shows that $h(z) = 0$ for all $z \in \mathbb{C}$, proving (ii) ⟹ (iii).

(iii) ⟹ (i). If (1) is satisfied for any $f$, it also is for $g(z) = f(z)(z - a)$; as $g(z)/(z - a) = f(z)$ and as $g(a) = 0$, we get $\int f(z)\,dz = 0$. This ends the proof.

We still need to justify fully point (b) of the previous proof. This is the aim of the next theorem, also very useful in many other circumstances. Basically, it is almost always applied to a measure of the form $d\mu(t) = \mu'(t)dt$ where the function $\mu'(t)$ is regulated, but the general case is not hard since, in all questions of this type, the only properties of measures really used are their definitions (linearity of $f \mapsto \mu(f)$ and upper bound in terms of the uniform norm of $f$) and elementary theorems about passing to the limit under the $\int$ sign that follow directly from the definition.


7 — Integrals Depending Holomorphically on a Parameter 


Theorem 9. Let $I$ be an interval, $\mu$ a measure on $I$, $U$ an open subset of $\mathbb{C}$ and $f : I \times U \to \mathbb{C}$ a function satisfying the following conditions:


(a) $f$ is continuous on $I \times U$,

(b) $f(t, z)$ is a holomorphic function of $z$ for all $t \in I$,

(c) for any compact set $H \subset U$, there exists a positive $\mu$-integrable³⁹ function $p_H(t)$ on $I$ such that


(7.1) $|f(t, z)| \le p_H(t)$ for all $t \in I$ and $z \in H$.


Let $f^{(r)}(t, z)$ be the derivative of order $r$ of $z \mapsto f(t, z)$. Then $f^{(r)}(t, z)$ satisfies conditions (a), (b) and (c) for all $r$, the function


(7.2) $g(z) = \int f(t, z)\,d\mu(t)$

is holomorphic on $U$, the functions $f^{(r)}(t, z)$ are $\mu$-integrable and


(7.3) $g^{(r)}(z) = \int f^{(r)}(t, z)\,d\mu(t)$ for all $r \in \mathbb{N}$.


It is sufficient to prove the statements with respect to r = 1: the general 
case will follow by a repeated application of the result. 
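
As an illustration of what the theorem asserts (not part of the proof), the following Python sketch takes the hypothetical example $f(t, z) = e^{tz}$ on $I = [0, 1]$ with the measure $dt$, for which $g(z) = (e^z - 1)/z$ is known in closed form, and compares the derivative of $g$ with the integral of $\partial f/\partial z$. The helper name `midpoint` and the test point `z0` are choices made here, and a midpoint rule stands in for the integral.

```python
import cmath

def midpoint(func, a, b, n=20000):
    """Midpoint-rule approximation of the integral of a complex-valued func over [a, b]."""
    h = (b - a) / n
    return h * sum(func(a + (k + 0.5) * h) for k in range(n))

def g(z):
    # g(z) = integral over [0, 1] of f(t, z) = exp(t z) with respect to dt
    return midpoint(lambda t: cmath.exp(t * z), 0.0, 1.0)

def g_prime_under_integral(z):
    # integral over [0, 1] of the z-derivative of f, namely t * exp(t z)
    return midpoint(lambda t: t * cmath.exp(t * z), 0.0, 1.0)

def g_prime_closed_form(z):
    # derivative of the closed form g(z) = (exp(z) - 1) / z, valid for z != 0
    return (cmath.exp(z) * (z - 1) + 1) / z**2

z0 = 0.7 + 1.3j  # arbitrary test point, chosen for illustration
err_g = abs(g(z0) - (cmath.exp(z0) - 1) / z0)
err_dg = abs(g_prime_under_integral(z0) - g_prime_closed_form(z0))
print(err_g, err_dg)
```

Both discrepancies should be at the level of the quadrature error, which is consistent with formula (7.3) for $r = 1$ in this particular case.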

First proof. Theorem 24 bis of Chap. V, § 7 is analogous to the result we need to prove but based on different assumptions: $f'$ (instead of $f$) was assumed to satisfy conditions (a) and (c). At that point, the only tool at our disposal was indeed the formula for differentiation of an integral with respect to a real parameter; it assumes that the derivative being integrated is continuous and that its integral converges normally on compact sets. We then obtained the holomorphy of (2) and formula (3) by differentiating integral (2) with respect to the coordinates $x$ and $y$ of $z$ and by checking Cauchy’s condition $D_2 g = iD_1 g$. To prove Theorem 9, it will, therefore, suffice to show that $f'$ satisfies (a) and (c).


39 If $d\mu(t) = \mu'(t)dt$ with $\mu'(t)$ regulated, this means that $\int p_H(t)|\mu'(t)|\,dt < +\infty$. In the case of an arbitrary positive measure, $p_H(t)$ can be assumed to be lsc (and in practice, even continuous) because a (Lebesgue) integrable positive function is always dominated by an integrable lsc function. Besides, note that (c) is always satisfied when $I$ is compact because $f$ is bounded on the compact set $I \times H$, so that it suffices to choose the constant function $p_H(t) = \sup|f(s, z)|$, sup being extended to $(s, z) \in I \times H$.

To prove that $f'$ is continuous, take a point $a$ of $U$. Let $R$ be the distance from $a$ to the boundary of $U$. Assuming $a = 0$ to simplify formulas, the open disc $D : |z| < r$ is contained in $U$ for $r < R$ and Cauchy’s integral formula shows that


(7.4) $f'(t, z) = \int f[t, re(u)]\,[re(u) - z]^{-2}\,re(u)\,du$


for $|z| < r$, where integration is over $J = [0, 1]$. The function under the $\int$ sign depends on the variables $u \in J$, $z \in D$ and $t \in I$ and, in view of the simplest results on integrals depending on parameters [Chap. V, n° 9, Theorem 9, (i)], it all amounts to showing that this function of $(u, z, t)$ is continuous on $J \times D \times I$, which is clear.

To show that $f'$ satisfies (c) for any compact set $H \subset U$, it suffices (Borel-Lebesgue) to show this in the neighbourhood of all $a \in U$, for example $a = 0$. As the points $re(u)$ remain in a compact subset of $U$, by (c), there is a positive, $\mu$-integrable function $p$ such that


$|f(t, re(u))| \le p(t)$ for all $t \in I$ and all $u$.


If $z$ remains in the disc $D' : |z| < r/2$, then $|re(u) - z| > r/2$, whence $|re(u) - z|^{-2} < 4r^{-2}$ and so $|f'(t, z)| \le Mp(t)$ for all $t \in I$ and $z \in D'$, with a constant $M$ independent of $t$ and $z$; thus (c) follows for $f'$.

It remains to apply Theorem 24 bis of Chap. V. For the reader’s convenience, we recall its proof. For this suppose $\mu$ positive, a case it can be reduced to. For any compact interval $K \subset I$, set


$g_K(z) = \int_K f(t, z)\,d\mu(t)$.


Regarded as a function of $t$, $x = \mathrm{Re}(z)$ and $y = \mathrm{Im}(z)$, the function $f$ has derivatives $D_1 f(t, z) = f'(t, z)$ and $D_2 f(t, z) = if'(t, z)$ with respect to $x$ and $y$, and, like $f'$, these are continuous functions on $K \times U$. Since integration is over a compact set, it is possible to differentiate under the $\int$ sign with respect to $x$ or to $y$ (Chap. V, § 2, n° 9, Theorem 24); the derivatives are obviously


$D_1 g_K(z) = \int_K f'(t, z)\,d\mu(t)$, $D_2 g_K(z) = i\int_K f'(t, z)\,d\mu(t)$.


As $f'$ is continuous and $K$ compact, they are continuous and satisfy Cauchy’s condition. The functions $g_K$ are, therefore, holomorphic, with


$g_K'(z) = \int_K f'(t, z)\,d\mu(t)$.
To finish the proof, consider a compact subset $H$ of $U$ and an upper bound $|f'(t, z)| \le p_H(t)$ valid for $t \in I$ and $z \in H$, with a positive $\mu$-integrable function $p_H$. Then, for any compact set $K \subset I$,


$\left|\int f'(t, z)\,d\mu(t) - g_K'(z)\right| \le \int_{I - K} p_H(t)\,d\mu(t)$


for all $z \in H$, and since the right hand side approaches 0 as $K$ “approaches” $I$, it can be concluded that $g_K'(z)$, and hence also the partial derivatives of $g_K$ with respect to $x$ and $y$, converge uniformly, up to a factor $i$, to $\int f'(t, z)\,d\mu(t)$ on $H$. The function $g = \lim g_K$, therefore, has partial derivatives with respect to $x$ and $y$ obtained by passing to the limit over those of $g_K$ (Chap. III, § 4, Theorem 19). They are continuous and like those of $g_K$ satisfy Cauchy’s condition, so that $g$ is holomorphic,⁴⁰ with $g'(z) = \lim g_K'(z) = \int f'(t, z)\,d\mu(t)$, qed.

Second proof. Theorem 9 can also be proved by using Weierstrass’ the- 
orem on uniform limits of holomorphic functions (Chap. VII, § 4, n° 19, The- 
orem 17). 

Let us first consider the case where $I$ is compact. In what follows, suppose that $z$ remains in a compact subset $H$ of $U$ and set $f_t(z) = f(t, z)$. As $I \times H$ is compact, $f$ is uniformly continuous on it. In particular, for every $r > 0$ there exists $r' > 0$ such that


(7.5) $|s - t| < r' \Longrightarrow \|f_s - f_t\|_H \le r$.


Having said this, let us partition $I$ into finitely many non-empty intervals $I_k$ of length $< r'$, choose points $t_k \in I_k$, and compare integral (2) to the Riemann sum $\sum f(t_k, z)\mu(I_k)$. Since, by (5), $|f(t, z) - f(t_k, z)| \le r$ for all $t \in I_k$ and all $z \in H$, it follows that


$\left|g(z) - \sum f(t_k, z)\mu(I_k)\right| \le \|\mu\|r$ for all $z \in H$,


where $\|\mu\|$ is the norm of $\mu$. This means that the function $g$ is a uniform limit of holomorphic functions on every compact set $H \subset U$; it is, therefore, holomorphic. Moreover, since a limit of holomorphic functions can be differentiated term by term according to the same theorem, $g^{(p)}(z)$ is the limit of the expressions $\sum f^{(p)}(t_k, z)\mu(I_k)$, which are just the Riemann sums with respect to integral (3), and so the theorem follows when $I$ is compact.

In the general case, replace $I$ by a compact interval $K \subset I$ and pass to the limit as in the previous proof.


40 This is the “useless” result mentioned in Chap. III, §5, at the end of n° 22. 
Third proof. Let us directly show that $g(z)$ has a power series expansion on the interior of any compact disc $D \subset U$. This reduces to the case where $D$ is the disc $|z| \le R$. As $z \mapsto f(t, z)$ is holomorphic and hence analytic, there is a Taylor series expansion on $D$


(7.6) $f(t, z) = \sum f^{(n)}(t, 0)\,z^{[n]}$

with $z^{[n]} = z^n/n!$ and

(7.7) $f^{(n)}(t, 0)\,R^{[n]} = \int f(t, Re(u))\,e(-nu)\,du$


for $t \in I$ (Fourier series... ). Since, according to assumption (c), $|f(t, z)| \le p_D(t)$ for $t \in I$ and $z \in D$, it follows that


(7.8) $|f^{(n)}(t, 0)| \le p_D(t)/R^{[n]}$


for all $n$ and all $t \in I$. For $|z| \le qR$ with $q < 1$ and all $r \in \mathbb{N}$, the Taylor series


(7.9) $\sum f^{(n+r)}(t, 0)\,z^{[n]} = f^{(r)}(t, z)$

is, therefore, dominated for all $t \in I$ by the series


(7.10) $\sum p_D(t)\,q^n R^{[n]}/R^{[n+r]} = p_D(t)R^{-r}\sum q^n (n + r)!/n!$,


which converges since $(n + r)!/n! \sim n^r$ for large $n$. If $K \subset I$ is compact, the continuous function $f(t, z)$ is bounded on $K \times D$ and $p_D(t)$ can be replaced in (10) by a constant independent of $(t, z) \in K \times D$, so that the Taylor series (9) converges normally on $K \times D$. As a result, the function $f^{(r)}(t, z)$ is continuous on $K \times D$ for any $K$, hence on $I \times D$, and so on $I \times U$ since the argument can be applied to any compact disc $D \subset U$.

As $p_D(t)$ is also a factor in (10),


(7.12) $\sum_{n \ge 0} \left|f^{(n+r)}(t, 0)\,z^{[n]}\right| \le M_r\,p_D(t)$,


where $M_r$ is a constant. Series (9) can, therefore (Chap. V, § 7, Theorem 20 generalized to an arbitrary measure), be integrated term by term on $I$, and so

$\int f^{(r)}(t, z)\,d\mu(t) = \sum a_{n+r}\,z^{[n]}$ where $a_n = \int f^{(n)}(t, 0)\,d\mu(t)$.


For $r = 0$, this shows that $g(z) = \sum a_n z^{[n]}$, and, therefore, that $g$ has a power series expansion and that, moreover,


$\int f^{(r)}(t, z)\,d\mu(t) = \sum a_{n+r}\,z^{[n]} = g^{(r)}(z)$.


This is relation (3). That $f^{(r)}(t, z)$ satisfies condition (c), a fact of local nature, follows immediately from (12).
§ 3. Some Applications of Cauchy’s Method


In this § aimed at showing that even a limited knowledge of Cauchy the- 
ory allows us to do mathematics that does not merely amount to irrelevant 
exercises, the following notation will be systematically used: 

$L^1(\mathbb{R})$ will denote the set of functions defined and absolutely integrable over $\mathbb{R}$ with respect to the usual measure $dx$; the reader is free to interpret this notation in the sense of Lebesgue theory. In fact, as is done in Lebesgue theory, we will often say “integrable” instead of “absolutely integrable”, even if this means warning the reader when we encounter semi-convergent integrals (Chapter V, § 7);

$F^1(\mathbb{R})$ will denote the set of continuous functions $f$⁴¹ on $\mathbb{R}$ such that both $f \in L^1(\mathbb{R})$ and $\hat f \in L^1(\mathbb{R})$ hold; Fourier’s inversion formula applies to these functions (Chapter VII, § 6, n° 30, Theorem 26).

Recall that, for us, the Fourier transform is defined by the formula


$\hat f(y) = \int f(x)\,e(-xy)\,dx$

where integration is over $\mathbb{R}$ and where, for every $z \in \mathbb{C}$, as in Chapter VII,

$e(z) = \exp(2\pi iz)$.

So $e(-x) = \overline{e(x)}$ for $x \in \mathbb{R}$ and

$|e(z)| = \exp(-2\pi y)$, $y = \mathrm{Im}(z)$


for all $z \in \mathbb{C}$.
Expressions such as the following will be frequently used: 


$f(z) = O(g(z))$ at infinity on $U$,


where $U$ is a subset of $\mathbb{C}$; this means (Chapter II, n° 3) that there exist $M > 0$ and $R > 0$ such that


$z \in U$ & $|z| > R \Longrightarrow |f(z)| \le M|g(z)|$.

Similar conventions apply to the relations $o$, $\asymp$ and $\sim$. We will also write

$f(z) \asymp g(z)$ on $U$

if there are constants $m, M > 0$ such that


$m|g(z)| \le |f(z)| \le M|g(z)|$ for all $z \in U$

and not only for large $z$ or near a given point. The next result often serves as an example if $U = \mathbb{C} - \mathbb{R}_-$:


41 A somewhat superfluous condition: in Lebesgue theory, any $f \in L^1(\mathbb{R})$ whose Fourier transform is integrable is shown to be equal “almost everywhere” to a continuous function given by the inversion formula.


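Lemma. Let $U \subset \mathbb{C}^*$ be a domain on which there is a uniform bounded branch⁴² of $\mathrm{Arg}(z)$. Then


$|z^s| \asymp |z|^{\mathrm{Re}(s)}$ on $U$


for any uniform branch of $z^s$ on $U$.


Indeed, $z^s = \exp(s\,\mathrm{Log}\,z)$, where $\mathrm{Log}\,z = \log|z| + i\,\mathrm{Arg}\,z$ for any uniform branch on $U$ of the argument. Since


$\mathrm{Re}[s\,\mathrm{Log}\,z] = \mathrm{Re}(s)\log|z| - \mathrm{Im}(s)\,\mathrm{Arg}\,z$,

we get

$|z^s| = |z|^{\mathrm{Re}(s)}\,e^{-\mathrm{Im}(s)\,\mathrm{Arg}\,z}$.


Since by assumption $\mathrm{Arg}(z)$ remains in a compact subset of $\mathbb{R}$ for any $z \in U$, the exponential lies between two constants $m$ and $M > 0$, qed.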
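
The lemma can be illustrated numerically on $U = \mathbb{C} - \mathbb{R}_-$ with the principal branch, for which $|\mathrm{Arg}\,z| < \pi$ and the constants can be taken to be $e^{\mp\pi|\mathrm{Im}(s)|}$. The following Python sketch is only a sanity check under these assumptions; the exponent `s` and the sample points are arbitrary choices made here.

```python
import cmath
import math

def ratio(z, s):
    """|z^s| / |z|^Re(s) for the principal branch z^s = exp(s Log z)."""
    zs = cmath.exp(s * cmath.log(z))   # cmath.log is the principal branch
    return abs(zs) / abs(z) ** s.real

s = 1.5 - 0.8j                          # arbitrary complex exponent
lower = math.exp(-math.pi * abs(s.imag))
upper = math.exp(math.pi * abs(s.imag))

# Points of very different moduli and arguments, all off the negative real axis.
samples = [r * cmath.exp(1j * theta)
           for r in (1e-3, 1.0, 1e3)
           for theta in (-3.0, -1.0, 0.5, 2.5)]
ratios = [ratio(z, s) for z in samples]
print(min(ratios), max(ratios), (lower, upper))
```

The ratio depends only on $\mathrm{Arg}\,z$, not on $|z|$, which is exactly what the identity $|z^s| = |z|^{\mathrm{Re}(s)}e^{-\mathrm{Im}(s)\mathrm{Arg}\,z}$ predicts.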
We will most often write


$\int f(x)\,dx$ or $\int f\,dx$ instead of $\int_{-\infty}^{+\infty} f(x)\,dx$;


there will be no confusion as we will never use the absurd $\int f(x)\,dx$ to denote a primitive for a function $f$. On the other hand, $d^*x$ will denote the positive measure on $\mathbb{R}^*$ or sometimes on $\mathbb{R}_+^*$, but not⁴³ on $\mathbb{R}$, defined by the formula


$\int f(x)\,d^*x = \int_{-\infty}^{+\infty} f(x)|x|^{-1}\,dx$


for $f$ continuous and zero in a neighbourhood of 0 and of infinity, and more generally for any function making the integral absolutely convergent;

42 This is not always the case, even when $U$ is simply connected. For a counterexample, take $U$ to be the complement in $\mathbb{C}$ of a spiral with initial point the origin and tending towards infinity, for example the curve $t \mapsto te(t)$, $t \ge 0$.

43 Recall (Chap. V, § 9) that if $X \subset \mathbb{C}$ is locally compact (i.e. the intersection of an open and of a closed set), then a positive measure $\mu$ on $X$ is a linear functional $f \mapsto \mu(f)$ on the vector space $L(X)$ of continuous functions on $X$ that are zero outside a compact subset of $X$, satisfying $\mu(f) \ge 0$ for $f \ge 0$. A general measure is a linear functional on $L(X)$ such that, for any compact set $K \subset X$, there is an upper bound


$|\mu(f)| \le M_K\|f\|_K$


for any $f \in L(X)$ zero outside $K$. The measure $d^*x$ is not a measure on $\mathbb{R}$ because the integral $\int f(x)|x|^{-1}\,dx$ is not defined for $f \in L(\mathbb{R})$, if we do not at least require $f$ to be zero at 0.
the relevance of introducing such a measure lies in its invariance under “multiplicative translations” $x \mapsto ax$ ($a \ne 0$) and under $x \mapsto 1/x$:


$\int f(ax)\,d^*x = \int f(x)\,d^*x$, $\int f(1/x)\,d^*x = \int f(x)\,d^*x$.


In other words, $d^*x$ plays the same role for the multiplicative group $\mathbb{R}^*$ as the Lebesgue measure $dx$ does for the additive group $\mathbb{R}$.


8 — Fourier Transform of a Rational Fraction 


(i) Absolutely convergent integrals of rational functions. A seemingly trivial example, but one which illustrates one of the most used techniques in practice, is the computation of the integral of $f(x) = 1/(1 + x^2)$ over $\mathbb{R}$; the given function having $\mathrm{arctg}\,x$ as primitive, the result obviously equals $\pi$.


Fig. 8.9. 


Consider the above path $\mu$ in $\mathbb{C}$. The function $f$ is holomorphic on $\mathbb{C}$ except at $z = i$ or $-i$; the formula


$2i/(1 + z^2) = 1/(z - i) - 1/(z + i)$


shows that the residue at $i$ is equal to $1/2i$. The value of the integral along $\mu$ is, therefore, $2\pi i/2i = \pi$.

Having done this, consider the contribution from the half circle to the computation. Its length is $\pi R$. As $1/(1 + z^2) = O(1/|z|^2)$ for large $|z|$, the general upper bound (4.9) shows that this contribution is $O(1/R)$, and so

$\pi = \int_\mu f(z)\,dz = \int_{-R}^{R} \frac{dx}{1 + x^2} + O(1/R)$.

As the integral over $[-R, R]$ approaches the integral sought, it is equal to $\pi$ as expected.

It may be wondered why we choose to integrate over a half-circle rather 
than over other curves. The most probable reason is that, for the last two 
thousand five hundred years, not to go further back to homo erectus fasci- 
nated by the Moon and the Sun, the circle is justifiably an object of adoration 
for mathematicians. But we might as well integrate over the upper or lower 
part of the square bounded by the lines Re(z) = R or —R and Im(z) = 0 or 
R. The main point is that its length should be $O(R)$ as $R$ increases indefinitely, where $R$ denotes the minimum distance from the origin to the chosen contour points; it is even possible to go up to a length of the order of $o(R^2)$.

The method generalizes to integrals of rational fractions $f(x) = p(x)/q(x)$, where $q$ is assumed not to have any real roots and, to start with, $d^\circ(q) - d^\circ(p) = n \ge 2$ so as to ensure that $f \in L^1(\mathbb{R})$. $f$ is integrated along the same contour $\mu$ as above; choosing $R$ sufficiently large for the roots of $q$ in the upper half plane to be in the interior of $\mu$, we get


(8.1) $2\pi i \sum_{\mathrm{Im}(a) > 0} \mathrm{Res}(f, a)$,


a sum extended to the roots of $q$ in the upper half plane. Besides, the integral over $\mu$ is the sum of the integral of $f$ over $[-R, R]$ and of the integral along the half-circle of radius $R$. If $d^\circ(q) - d^\circ(p) = n$, then


(8.2) $p(z)/q(z) \sim c/z^n$ for large $|z|$


with a constant $c \ne 0$. For large $R$, the integral along the half-circle is, therefore, $\pi R \cdot O(R^{-n}) = O(R^{1-n})$ and tends to 0 since $n \ge 2$. Whence the final result:


(8.3’) $\int f(x)\,dx = 2\pi i \sum_{\mathrm{Im}(a) > 0} \mathrm{Res}(f, a)$.


Instead of integrating along the above contour, its symmetric with respect to the real axis would do as well; as it is followed clockwise, we get


(8.3”) $\int f(x)\,dx = -2\pi i \sum_{\mathrm{Im}(a) < 0} \mathrm{Res}(f, a)$.


Comparing these results shows that


(8.4) $\sum_{a \in \mathbb{C}} \mathrm{Res}(p/q, a) = 0$ if $d^\circ(q) - d^\circ(p) \ge 2$,


which has already been shown in (5.14).
If all the roots of $q$ are simple,


$p(z)/q(z) = [p(a) + p'(a)(z - a) + \dots]/[q'(a)(z - a) + \dots] = \frac{p(a)}{q'(a)(z - a)}\,[1 + ?(z - a) + \dots]$


in a neighbourhood of such a root $a$ since the quotient of two power series starting with 1 is a power series of the same type. Hence the formula


$\mathrm{Res}(p/q, a) = p(a)/q'(a)$


that allows us to compute the integral.

Let us for example choose the function $f(z) = p(z)/(1 + z^{2n})$, where $p$ is a polynomial of degree $\le 2n - 2$; its poles are (at most) the roots of the equation $z^{2n} = -1 = \exp(\pi i)$, i.e. the $2n$ points


$w_k = w^{2k+1}$ where $w = \exp(\pi i/2n)$, $0 \le k \le 2n - 1$.


The roots with positive imaginary parts are obtained for $0 \le k \le n - 1$ and the residue at $w_k$ is equal to


$p(w_k)/2nw_k^{2n-1} = -p(w_k)w_k/2n$.


So


$\int \frac{p(x)}{1 + x^{2n}}\,dx = -\frac{\pi i}{n}\sum_{k=0}^{n-1} p(w_k)\,w_k$.


For $p(x) = x^{m-1}$, the sum


$\sum_{0 \le k \le n-1} w_k^m = \sum w^{(2k+1)m} = w^m\sum w^{2km} = w^m\,\frac{1 - w^{2mn}}{1 - w^{2m}} = \frac{1 - (-1)^m}{w^{-m} - w^m}$


needs to be calculated so that the integral sought equals 0 if $m$ is even (obvious!) and $\pi/n\sin(m\pi/2n)$ if $m$ is odd.
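
The last computation is easy to check by machine. The following Python sketch, given only as a sanity check, evaluates the residue sum $-(\pi i/n)\sum_{k<n} w_k^m$ directly and compares it with the closed form stated above for $1 \le m \le 2n - 1$; the function names are choices made here.

```python
import cmath
import math

def integral_via_residues(m, n):
    """-(pi i / n) times the sum of w_k^m over the roots w_k in the upper half plane."""
    w = cmath.exp(1j * math.pi / (2 * n))
    total = sum(w ** ((2 * k + 1) * m) for k in range(n))
    return -(1j * math.pi / n) * total

def closed_form(m, n):
    """Value stated in the text: 0 for even m, pi / (n sin(m pi / 2n)) for odd m."""
    if m % 2 == 0:
        return 0.0
    return math.pi / (n * math.sin(m * math.pi / (2 * n)))

# p(x) = x^(m-1) has degree <= 2n - 2 exactly when 1 <= m <= 2n - 1.
checks = [(m, n, integral_via_residues(m, n), closed_form(m, n))
          for n in (1, 2, 3, 5)
          for m in range(1, 2 * n)]
for m, n, by_residues, expected in checks:
    print(m, n, by_residues, expected)
```

For $n = 1$, $m = 1$ this reproduces the value $\pi$ of the integral of $1/(1 + x^2)$ obtained at the beginning of the section.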


(ii) Semi-convergent integrals of rational functions. The previous method can also be applied if $d^\circ(q) = d^\circ(p) + 1$. But some precautions are called for since the extended integral over $\mathbb{R}$ is no longer absolutely convergent. Nonetheless, we can set


(8.5) $\int f(x)\,dx = \lim \int_{-R}^{+R} f(x)\,dx = \lim \int_0^R [f(x) + f(-x)]\,dx$.


This limit exists since


$f(z) = c/z + O(1/z^2)$ at infinity

and hence $f(z) + f(-z) = O(1/z^2)$. In the decomposition of $f$ into simple elements, the integrals of terms of the form $A/(z - a)^r$ with $r \ge 2$ can be computed as before; computations may even be omitted because they are zero, either because that is the case of the corresponding residues or simply because, the function $1/(x - a)^r$ admitting a primitive approaching 0 at infinity for $r \ge 2$, the FT solves the problem without having to invoke Cauchy.

It, therefore, remains to compute the above integral for $f(z) = 1/(z - a)$, $a \notin \mathbb{R}$. A first method consists in observing (Chapter V, § 6, n° 20) that, on the simply connected open subset $G = \mathbb{C} - \mathbb{R}_-$, the function $1/z$ admits as primitive any uniform branch $L(z)$ of the pseudo-function $\mathrm{Log}\,z$, for example the one obtained by setting


$L(z) = \log|z| + i\,\mathrm{Arg}\,z$ with $|\mathrm{Arg}\,z| < \pi$.


As $a$ is not real, the points $x - a$ are in $G$ for $x \in \mathbb{R}$, so that $L(x - a)$ can be chosen as a primitive for $1/(x - a)$ on $\mathbb{R}$. Hence


$\int_{-R}^{R} \frac{dx}{x - a} = L(R - a) - L(-R - a)$.


The variation of the real part of

$L(x - a) = \log|x - a| + i\,\mathrm{Arg}(x - a)$


over $[-R, R]$ is equal to $\log(|R - a|/|R + a|)$ and tends to 0 since $|R - a|/|R + a|$ approaches 1 as $R$ increases; the argument of $R - a$ tends to 0 since the half-line with initial point 0 passing through $R - a$ approaches the half-line $\mathbb{R}_+$; finally, the half-line with initial point 0 passing through $-R - a$ approaches $\mathbb{R}_-$, but is in the half-plane $\mathrm{Im}(z) < 0$ if $\mathrm{Im}(a) > 0$ and in the half-plane $\mathrm{Im}(z) > 0$ if $\mathrm{Im}(a) < 0$; the argument of $-R - a$, therefore, approaches $-\pi$ if $\mathrm{Im}(a) > 0$ and $+\pi$ if $\mathrm{Im}(a) < 0$. So


(8.6) $\int \frac{dx}{x - a} = \pi i$ if $\mathrm{Im}(a) > 0$, $= -\pi i$ if $\mathrm{Im}(a) < 0$.

Finally, in the general case, it follows that


(8.7) $\int f(x)\,dx = \pi i \sum_{\mathrm{Im}(a) > 0} \mathrm{Res}(f, a) - \pi i \sum_{\mathrm{Im}(a) < 0} \mathrm{Res}(f, a)$.


When $d^\circ(q) \ge d^\circ(p) + 2$, the sum of all the residues of $f$ in $\mathbb{C}$ is zero and (7) reduces to (3’) or (3”), as desired. If on the contrary $d^\circ(q) = d^\circ(p) + 1$, it is the sum of the residues in $\mathbb{C}$ and at infinity which is zero by (5.14’). Then we find for example


(8.8) $\int f(x)\,dx = 2\pi i \sum_{\mathrm{Im}(a) > 0} \mathrm{Res}(f, a) + \pi i\,\mathrm{Res}(f, \infty)$.


A quicker second method consists in integrating over the same closed contours as in the absolute convergence case. Start with the relation $f(z) = c/z + O(1/z^2)$ where $c = -\mathrm{Res}(f, \infty)$ (careful with the sign!) and observe that the integral of $O(1/z^2)$ over the half-circle approaches 0. Therefore, the integral of $f$ along the latter approaches the same limit as that of $c/z$; it is calculated by setting $z = Re^{it}$, and so $dz/z = i\,dt$, and as integration is over $(0, \pi)$, the result is equal to $\pi ic = -\pi i\,\mathrm{Res}(f, \infty)$. Taking into account the poles in the interior of the integration contour,


$\int f(x)\,dx - \pi i\,\mathrm{Res}(f, \infty) = 2\pi i \sum_{\mathrm{Im}(a) > 0} \mathrm{Res}(f, a)$.


This again leads to (8).
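
The first method, which leads to (8.6), lends itself to a quick numerical check, since the principal branch of the logarithm in Python's `cmath` module is precisely $\log|z| + i\,\mathrm{Arg}\,z$ with $\mathrm{Arg}\,z \in (-\pi, \pi]$. The sketch below is only an illustration; the value of `a` and the cut-off `R` are arbitrary choices made here.

```python
import cmath
import math

def symmetric_integral(a, R):
    """Integral of dx/(x - a) over [-R, R], computed as L(R - a) - L(-R - a)."""
    # For non-real a the points x - a never meet the negative real axis,
    # so the principal branch is a primitive of 1/(x - a) along the real line.
    return cmath.log(R - a) - cmath.log(-R - a)

upper = symmetric_integral(0.3 + 2.0j, 1e6)   # Im(a) > 0, expected close to +pi i
lower = symmetric_integral(0.3 - 2.0j, 1e6)   # Im(a) < 0, expected close to -pi i
print(upper, lower)
```

The real parts are of the order of $1/R$, in agreement with the remark that $\log(|R - a|/|R + a|)$ tends to 0.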


(iii) Absolutely convergent Fourier transforms. For $t$ real, let us now consider the Fourier integral


(8.9) $\hat f(t) = \int f(x)\,e(-tx)\,dx = \int \frac{p(x)}{q(x)}\,e^{-2\pi itx}\,dx$


where $f = p/q$ is again a rational fraction whose denominator has no real roots; here too the integral is absolutely convergent if $n = d^\circ(q) - d^\circ(p) \ge 2$.
Suppose first that $t > 0$. The function


$g(z) = f(z)\,e(-tz)$


is holomorphic on $\mathbb{C}$ deprived of the roots of $q$; $f(z) \sim c/z^n$ for large $|z|$, and $|e(-tz)| = \exp(2\pi ty)$ is $\le 1$ on the half-plane $\mathrm{Im}(z) \le 0$. Hence, if $g$ is integrated over the contour formed by the interval $[-R, R]$ followed by the lower half-circle of radius $R$, the contribution of the half-circle is $O(1/R^{n-1})$ for large $R$, and so approaches 0. As the integration contour is followed clockwise, the index of a point in its interior is $-1$ and the residue theorem shows that


(8.10’) $\hat f(t) = -2\pi i \sum_{\mathrm{Im}(a) < 0} \mathrm{Res}_{z=a}[f(z)\,e(-tz)]$ ($t > 0$),


the notation for the residue being self-explanatory. For $t < 0$, we use the upper half-circle on which $e(-tz)$ is bounded, and so


(8.10”) $\hat f(t) = 2\pi i \sum_{\mathrm{Im}(a) > 0} \mathrm{Res}_{z=a}[f(z)\,e(-tz)]$ ($t < 0$).


For $t = 0$, we recover the results of section (i).
Suppose for example that


(8.11) $f(x) = (x^2 + w^2)^{-1} = \frac{1}{2iw}\left(\frac{1}{x - iw} - \frac{1}{x + iw}\right)$

with $\mathrm{Re}(w) > 0$. The function $g(z) = \exp(-2\pi itz)f(z)$ has simple poles at $iw$ and $-iw$, and clearly⁴⁴


$\mathrm{Res}(g, iw) = e^{2\pi tw}/2iw$, $\mathrm{Res}(g, -iw) = -e^{-2\pi tw}/2iw$.

As $\mathrm{Im}(iw) > 0$ and $\mathrm{Im}(-iw) < 0$, multiplying by $\pm 2\pi i$ as prescribed by (10’) and (10”), we get


$\int \frac{e(-tx)}{x^2 + w^2}\,dx = \pi e^{-2\pi tw}/w$ if $t > 0$, $= \pi e^{2\pi tw}/w$ if $t < 0$ (if $\mathrm{Re}(w) > 0$);


in other words


(8.12) $\int \frac{e(-tx)}{x^2 + w^2}\,dx = \pi e^{-2\pi|t|w}/w$ if $\mathrm{Re}(w) > 0$.

Note that $|e^{-2\pi|t|w}| = e^{-2\pi|t|\mathrm{Re}(w)}$ approaches 0 exponentially as $|t|$ increases indefinitely since $\mathrm{Re}(w) > 0$, so that $\hat f \in L^1(\mathbb{R})$; in other words, $f \in F^1(\mathbb{R})$. Fourier’s inversion formula can, therefore, be applied, and so


$\int e^{-2\pi|t|w}\,e(tx)\,dt = w/\pi(x^2 + w^2)$, $\mathrm{Re}(w) > 0$.


This formula is easy to check directly: integrate over $t > 0$ and $t < 0$ taking into account that $e^{ct}$ has $e^{ct}/c$ as primitive for any $c \in \mathbb{C}^*$.
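
The inversion formula just obtained can also be checked numerically, since its integrand decays exponentially. The following Python sketch is an illustration only: it truncates the integral to $[-T, T]$ and uses the trapezoidal rule, and the values of `x`, `w`, `T` and `n` are arbitrary choices made here.

```python
import cmath
import math

def inverse_transform(x, w, T=6.0, n=120000):
    """Trapezoidal approximation of the integral over [-T, T] of exp(-2 pi |t| w) e(t x) dt."""
    h = 2 * T / n
    total = 0j
    for k in range(n + 1):
        t = -T + k * h
        value = cmath.exp(-2 * math.pi * abs(t) * w + 2j * math.pi * t * x)
        # end points carry weight 1/2 in the trapezoidal rule
        total += value if 0 < k < n else value / 2
    return h * total

x, w = 0.5, 1.0 + 0.25j            # any w with Re(w) > 0 would do
numeric = inverse_transform(x, w)
exact = w / (math.pi * (x * x + w * w))
print(numeric, exact)
```

With `n` even the grid contains the point $t = 0$, where the integrand has a corner, so the trapezoidal rule keeps its usual accuracy on each half-line.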


To obtain an explicit form of $\hat f(t)$ in the general case, observe that any rational fraction is the sum of a polynomial and of a linear combination of functions of the form


(8.13) $f(x) = (x - a)^{-n}$,


where $n$ is an integer $\ge 1$ and $a$ is a constant; its Fourier transform is well-defined only if its decomposition does not have any polynomial terms and if its poles are not real. Hence if an explicit formula is found for the Fourier transform of (13) for $a \notin \mathbb{R}$, the decomposition into simple elements will give the result in the general case. In this section, we will suppose that $n \ge 2$ and that $\mathrm{Im}(a) > 0$, the case $\mathrm{Im}(a) < 0$ being similar.

The only pole of the function


$g(z) = (z - a)^{-n}\,e(-tz)$


being $a$, it is already possible to conclude that $\hat f(t) = 0$ for $t > 0$. For $t < 0$, the residue at $a$ of

$(z - a)^{-n}\,e(-ta)\,e[-t(z - a)] = e(-at)\,(z - a)^{-n}\sum_N [-2\pi it(z - a)]^{[N]}$

needs to be calculated. Here we use the notation of divided powers $x^{[n]} = x^n/n!$. As we want to find the coefficient of $(z - a)^{-1}$, it follows that


$\mathrm{Res}(g, a) = e(-at)\,(-2\pi it)^{[n-1]}$.


Formulas (10’) and (10”) then show that


(8.14) $\int (x - a)^{-n}\,e(-tx)\,dx = 2\pi i\,(-2\pi it)^{[n-1]}\,e(-at)$ if $t < 0$, $= 0$ if $t > 0$ (if $\mathrm{Im}(a) > 0$).


44 If $\varphi$ has a simple pole at $a$ and if $\psi$ is holomorphic and non-zero at $a$, then
$\varphi(z)\psi(z) = [c/(z - a) + \dots]\,[\psi(a) + \dots]$,
and so $\mathrm{Res}(\varphi\psi, a) = \psi(a)\,\mathrm{Res}(\varphi, a)$.


To avoid major mistakes, it is worth checking that $\hat f(t)$ approaches 0 at infinity.⁴⁵

Allowing for notation, formula (14) was announced in Chapter VII, § 6, n° 27, example 1 without our then being in a position to prove it; we lacked Cauchy theory to be able to justify it. Replacing $t$ by $-t$ and denoting by $t_+$ the function equal to $t$ for $t > 0$ and to 0 otherwise, (14) can also be written as


(8.15) $\int (x - a)^{-n}\,e(tx)\,dx = (2\pi i)^n\,t_+^{[n-1]}\,e(at)$, $\mathrm{Im}(a) > 0$, $n \ge 2$.

Formula (15) can also be written


$\int (x - a)^{-n}\,e[t(x - a)]\,dx = (2\pi i)^n\,t_+^{[n-1]}$.


Setting $x - a = z$, the integral over $\mathbb{R}$ is transformed into an integral à la Cauchy along the (unbounded) path $\mathrm{Im}(z) = -\mathrm{Im}(a) = c < 0$:


(8.15’) $\int_{\mathrm{Im}(z) = c < 0} z^{-n}\,e(tz)\,dz = (2\pi i)^n\,t_+^{[n-1]}$.


The change of variable $tz = \zeta$ transforms the horizontal $\mathrm{Im}(z) = c$ into a horizontal $\mathrm{Im}(\zeta) = tc = -t\,\mathrm{Im}(a) = b$. As $\zeta^{-n}d\zeta = t^{1-n}z^{-n}dz$ and as $b$ and $t$ are of opposite sign, relation (14) is equivalent to


$\int_{\mathrm{Im}(\zeta) = b} \zeta^{-n}\,e(\zeta)\,d\zeta = (2\pi i)^n/(n - 1)!$ if $b < 0$, $= 0$ if $b > 0$.


45 The Fourier transform of an absolutely integrable function $f$ is continuous and approaches 0 at infinity: Chap. VII, n° 27, theorem 23.

Fig. 8.10.


The function being integrated is holomorphic on $\mathbb{C}$ except at $\zeta = 0$. It, therefore, does not come as a surprise that the integral only depends on the sign of $b$. To justify this directly, we integrate along the contour of the rectangle ABCD of the above figure; by Cauchy’s theorems, the result is zero, and it all boils down to showing that, as $T$ tends to $+\infty$, the contributions from the vertical sides approach 0. However, on these sides,


$|\zeta^{-n}\,e(\zeta)| = (T^2 + \eta^2)^{-n/2}\exp(-2\pi\eta)$,


where $\eta = \mathrm{Im}(\zeta)$ varies between $b$ and $b'$, and so $0 < m \le |e(\zeta)| \le M < +\infty$ with constants $m$ and $M$ independent of $T$. On the other hand, $(T^2 + \eta^2)^{-n/2}$ remains between its values at $\eta = b$ and $\eta = b'$; $|\zeta^{-n}e(\zeta)|$ is $\asymp T^{-n}$ on the vertical sides. Their contribution is, therefore, $O(T^{-n})$, qed.

This argument also explains why the value of the integral over the positive horizontals is different from that on the negative ones: in such a case, there is a pole at $\zeta = 0$ in the interior of the rectangle ABCD, so that, up to a factor $2\pi i$, the integral is equal to the residue of $\zeta^{-n}e(\zeta)$ at 0.

Exercise 1. Translate (15’) using the change of variable 27itz =. 

Exercise 2. Applying Poisson’s summation formula to the function $x \mapsto (x - z)^{-k}$, show that


(8.16) $\sum_{n \in \mathbb{Z}} \frac{1}{(z + n)^k} = \frac{(-2\pi i)^k}{(k - 1)!}\sum_{n \ge 1} n^{k-1}e^{2\pi inz}$ for $k \ge 2$, $\mathrm{Im}(z) > 0$.


Recover (16) by differentiating the partial fractions expansion of $\mathrm{cotg}\,\pi z$.
Exercise 3. Check that Plancherel’s formula holds for the Fourier transform (14).
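
Before attempting Exercise 2, it may be reassuring to test the identity to be proved on a numerical example. The Python sketch below does this for $k = 4$, where both series converge quickly; it proves nothing, and the point `z` and the truncation orders are arbitrary choices made here.

```python
import cmath
import math

def lattice_sum(z, k, N=2000):
    """Symmetric partial sum of 1/(z + n)^k over |n| <= N."""
    return sum(1 / (z + n) ** k for n in range(-N, N + 1))

def fourier_side(z, k, N=60):
    """(-2 pi i)^k / (k-1)! times the partial sum of n^(k-1) e^{2 pi i n z}, n = 1..N."""
    factor = (-2j * math.pi) ** k / math.factorial(k - 1)
    return factor * sum(n ** (k - 1) * cmath.exp(2j * math.pi * n * z)
                        for n in range(1, N + 1))

z, k = 0.3 + 0.5j, 4               # any z with Im(z) > 0 and any k >= 2
gap = abs(lattice_sum(z, k) - fourier_side(z, k))
print(gap)
```

The right hand side converges geometrically because $|e^{2\pi iz}| = e^{-2\pi\,\mathrm{Im}(z)} < 1$, whereas the left hand side only converges like $\sum n^{-k}$, hence the very different truncation orders.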


(iv) Semi-convergent Fourier transforms. The calculation of the Fourier transform of a rational function $f = p/q$ assumes that $d^\circ(q) - d^\circ(p) = n \ge 2$. As in (ii), the more subtle case when $n = 1$ can be handled; as we will see, the integral defining $\hat f(t)$ is then semi-convergent for all $t$, which we already know to be the case for $t = 0$.

Note first that


$f(z) = c/z + O(1/z^2)$ at infinity


as already remarked above. So only the term in $1/z$ poses a problem in the calculations and estimations which were successfully done for $n \ge 2$.

On the other hand, if the function $f$ is real on $\mathbb{R}$, which may be assumed, its derivative only has finitely many real roots and hence its sign remains constant for large $|x|$; $f(x)$, therefore, tends monotonically to 0 as $x \in \mathbb{R}$ tends to $+\infty$ or $-\infty$, so if the Fourier integral is defined by the formula


(8.17) $\hat f(t) = \lim \int_{-R}^{R} f(x)\,e(-tx)\,dx$,


it remains well-defined for $t \ne 0$ (Chapter V, n° 24, Theorem 23, integrals of “oscillating” functions), hence also for complex $f$ by separating real and imaginary parts. In fact (17) is also well-defined for $t = 0$, but for different reasons as shown above in (ii).

Having said this, suppose first that $t > 0$ and integrate along the path already used for absolutely converging integrals. Setting $g(z) = f(z)e(-tz)$, the contribution from the lower half-circle can be obtained by integrating the function $g(Re^{iu})\,iRe^{iu}$ from $u = 0$ to $u = -\pi$. As $|f(z)| \le M/|z|$, where $M$ is a constant,


$|g(Re^{iu})\,iRe^{iu}| = R\,|f(Re^{iu})| \cdot |e(-Rte^{iu})| \le M\exp(2\pi Rt\sin u)$.


Therefore, the contribution from the lower half-circle is, in modulus, bounded above by


$M\int_{-\pi}^{0} \exp(2\pi Rt\sin u)\,du = M\int_0^{\pi} \exp(-p\sin u)\,du$,

where $p = 2\pi Rt$ tends to $+\infty$ since $t$ is supposed to be strictly positive. To show that this integral tends to 0, first note that the integrated function is everywhere $\le 1$ and that it tends to 0 except for $u = 0$ or $\pi$ since $-p\sin u$ tends to $-\infty$; the (real) Lebesgue theorem of dominated convergence takes care of the question since the set consisting of the points 0 and $\pi$ has measure zero. In the event of a refusal to use it, note first that, for given $t > 0$ and $\delta > 0$, the continuous function being integrated decreasingly converges to a limit function continuous on $[\delta, \pi - \delta]$, namely 0; convergence is, therefore, uniform in such an interval (Chapter V, § 2, n° 10, Dini’s theorem), so that its contribution to the integral tends to 0, and so is $\le \delta$ for large $p$; as those of the two forgotten intervals are $\le \delta$ for all $p$, the total is $\le 3\delta$ for large $p$, implying the result once again. If, however, you refuse to use Dini’s theorem, you can check it by an explicit calculation. For this observe that


$\sin u \ge \sin\delta = a$ on $[\delta, \pi - \delta]$,

where $a$ is a strictly positive constant, so that

$\exp(-p\sin u) \le \exp(-ap)$


and uniform convergence on the interval under consideration holds again since $\exp(-ap)$, which does not depend on $u$, approaches 0 as $p$ increases indefinitely.

In fact, a somewhat more precise result can be obtained. It is sometimes called Jordan’s lemma after the author of a famous Analysis Course, professor at the École Polytechnique from 1876 to 1911; given there were about two hundred students per year, his students must have provided many artillerymen. It suffices to study the integral


$I(R) = \int_0^{\pi} \exp(-R\sin u)\,du = 2\int_0^{\pi/2} \exp(-R\sin u)\,du$


and to note that between 0 and $\pi/2$, $u/2 \le \sin u \le u$. The relation

$I(R) \asymp \int_0^{\pi/2} \exp(-Ru)\,du \asymp 1/R$


immediately follows.
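
The two-sided estimate can be made explicit: the inequalities $u/2 \le \sin u \le u$ on $[0, \pi/2]$ give $2(1 - e^{-R\pi/2})/R \le I(R) \le 4/R$. The following Python sketch, a mere sanity check, evaluates $R \cdot I(R)$ by the midpoint rule for a few values of $R$; the function name and the number of subdivisions are choices made here.

```python
import math

def jordan_integral(R, n=100000):
    """Midpoint-rule approximation of I(R), the integral of exp(-R sin u) over [0, pi]."""
    h = math.pi / n
    return h * sum(math.exp(-R * math.sin((k + 0.5) * h)) for k in range(n))

# R * I(R) should stay between 2 (1 - exp(-R pi / 2)) and 4,
# as follows from u/2 <= sin u <= u on [0, pi/2].
scaled = {R: R * jordan_integral(R) for R in (1.0, 10.0, 100.0)}
for R, value in scaled.items():
    print(R, value, 2 * (1 - math.exp(-R * math.pi / 2)))
```

Numerically $R \cdot I(R)$ appears to approach 2 as $R$ increases, which is consistent with $I(R) \asymp 1/R$.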

The previous arguments suppose that $t \ne 0$ and fall apart if $t = 0$. But in this case, we are brought back to the computations of section (ii) and for example to formula (8), which shows that


(8.18) $\hat f(0) = 2\pi i \sum_{\mathrm{Im}(a) > 0} \mathrm{Res}(f, a) + \pi i\,\mathrm{Res}(f, \infty) = -2\pi i \sum_{\mathrm{Im}(a) < 0} \mathrm{Res}(f, a) - \pi i\,\mathrm{Res}(f, \infty)$,


where the residue at infinity is given by the relation

$\mathrm{Res}(f, \infty) = -\lim zf(z) = -c$.


In conclusion, (10’) and (10”) continue to hold for $t \ne 0$, the case $t = 0$ being a consequence of (18). For example,


(8.19) $\int \frac{e(-tx)}{x - a}\,dx = 0$ if $t > 0$, $= \pi i$ if $t = 0$, $= 2\pi i\,e(-ta)$ if $t < 0$ ($\mathrm{Im}(a) > 0$).

Replacing $t$ by $-t$, we get

(8.20) $\int (x - a)^{-1}\,e(tx)\,dx = 2\pi i\,t_+^0\,e(at)$

by stipulating, like Fejér, that


$t_+^0 = 1$ if $t > 0$, $= 1/2$ if $t = 0$, $= 0$ if $t < 0$.


We could have spared ourselves all these calculations (but we did not want to) by alluding to Theorem 27 of Chapter VII, § 6, n° 30: if $f$ is a regulated and absolutely integrable function on $\mathbb{R}$, then


$\lim \int_{-R}^{R} \hat f(y)\,e(xy)\,dy = \frac{1}{2}[f(x+) + f(x-)]$


at every point where $f$ has left and right derivatives. In the present case, it can be applied to the function $f$ defined by the right hand sides of (19): it is obviously regulated, right and left differentiable everywhere, integrable since $\mathrm{Im}(a) > 0$, and finally its Fourier transform is $1/(x - a)$ as shown by a most simple direct calculation. It then remains to check that $\frac{1}{2}[f(0+) + f(0-)] = \pi i$.


9 — Summation Formulas 


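The residue method allows us to prove several summation formulas. Let us
for example show that if $f(z)$ is an entire function satisfying an inequality
of the form

$$|f(z)| \le M e^{\pi a|\mathrm{Im}(z)|} \quad \text{with } 0 < a < 1, \tag{9.1}$$

then

$$\frac{\pi f(z)}{\sin \pi z} = \sum_{\mathbb{Z}} (-1)^n \frac{f(n)}{z-n} = \lim_{p\to\infty} \sum_{|n|<p} (-1)^n \frac{f(n)}{z-n}. \tag{9.2}$$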


For this, let us consider the meromorphic function^46 $g(w) = \pi f(w)/\sin \pi w$.
Its only singularities are, at worst, simple poles at points $n \in \mathbb{Z}$ with
$\mathrm{Res}(g,n) = (-1)^n f(n)$. For given $z \notin \mathbb{Z}$, the function

$$h(w) = \frac{g(w)}{w-z} = \frac{\pi f(w)}{(w-z)\sin \pi w}$$

has simple poles at $w \in \mathbb{Z}$, with

$$\mathrm{Res}(h,n) = (-1)^n f(n)/(n-z) = -(-1)^n f(n)/(z-n), \tag{9.3}$$

and also a simple pole at $z$, where

$$\mathrm{Res}(h,z) = g(z). \tag{9.4}$$

46 Recall that the only singularities of a meromorphic function on a domain $G$
(here, $\mathbb{C}$) are necessarily isolated poles. If $f = p/q$ where $q$ has a simple zero at
$a$ and $p$ is holomorphic at $a$, then $\mathrm{Res}(f,a) = p(a)/q'(a)$.


If, for given $p \in \mathbb{N}$, $h(w)$ is integrated along the rectangle of Figure 11, an idea
already applied in the previous n°, the result is then equal, up to a factor
$2\pi i$, to the difference between $g(z)$ and the partial sum $|n| < p$ of series (2).
Hence it all amounts to proving that the integral of $h$ tends to 0 as $p$ increases
indefinitely. For this, we find upper bounds for the contributions from each of
the sides of the rectangle.


[Figure: rectangular contour ABCD in the $w = u + iv$ plane, with horizontal sides at $v = \pm p$ and the integers marked on the real axis.]

Fig. 9.11.
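On the side BC,

$$|e^{\pi i w}| = e^{-\pi p}, \qquad |e^{-\pi i w}| = e^{\pi p},$$

and as a result $|e^{\pi i w} - e^{-\pi i w}| \ge e^{\pi p} - e^{-\pi p} \ge e^{\pi p}/2$ for large $p$, and so, using
(1), we get the upper bound

$$|f(w)/\sin \pi w| \le c_1 e^{\pi(a-1)p} \tag{9.5}$$

where $c_1$ does not depend on $p$. Since also

$$|w-z| \ge |\mathrm{Im}(w-z)| = |p - \mathrm{Im}(z)| \ge p/2 \quad \text{for large } p$$

on BC,

$$|h(w)| \le c_1\,p\,e^{\pi(a-1)p} \quad \text{for large } p. \tag{9.6}$$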
The integral along BC is, therefore, up to a constant, bounded above by
$(2p+1)\,p\,e^{\pi(a-1)p}$ and tends to 0 since $a < 1$. A similar argument holds for
the contribution from side DA.

Next, let us investigate the contribution from AB. As $w = p - 1/2 + iv$,

$$\sin \pi w = \pm\cosh \pi v,$$

whence $|\sin \pi w| \ge \frac12 e^{\pi|v|}$ and

$$|h(w)| \le \frac{2\pi e^{-\pi|v|}\,|f(p-1/2+iv)|}{|p-1/2+iv-z|} \le \frac{c_2\,e^{\pi(a-1)|v|}}{|p-1/2-\mathrm{Re}(z)|}$$

for all $v$. Integrating from $v=-p$ to $v=p$, the upper bound

$$c_2\,|p-1/2-\mathrm{Re}(z)|^{-1}\int_{-p}^{p} e^{\pi(a-1)|v|}\,dv$$

is obtained. It tends to 0 like the first factor since the extended integral over
$\mathbb{R}$ converges for $a<1$.

Therefore, the contribution from AB and obviously that from CD too
both tend to 0, giving formula (2). The series extended to $\mathbb{Z}$ needs to be
interpreted as the limit of its symmetric partial sums since it may very well
not converge unconditionally, for instance in the case of the function $f(z) = 1$.
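
Formula (2) lends itself to a quick numerical experiment. The sketch below uses illustrative names and two test functions chosen here, not in the text: $f = 1$ and $f(w) = \cos 2w$, which satisfies (1) with $a = 2/\pi$. It compares the symmetric partial sums with $\pi f(z)/\sin \pi z$.

```python
import cmath
import math

def symmetric_partial_sum(f, z, p):
    """Sum over |n| < p of (-1)^n f(n) / (z - n), the partial sums used in (9.2)."""
    total = 0j
    for n in range(-p + 1, p):
        sign = -1.0 if n % 2 else 1.0
        total += sign * f(n) / (z - n)
    return total

def left_hand_side(f, z):
    """pi f(z) / sin(pi z)."""
    return math.pi * f(z) / cmath.sin(math.pi * z)

z = 0.3 + 0.2j
constant_one = lambda w: 1.0
cos_2w = lambda w: cmath.cos(2 * w)   # |cos 2w| <= exp(2 |Im w|), i.e. a = 2/pi < 1

for f in (constant_one, cos_2w):
    # the gap shrinks as p grows
    print(abs(symmetric_partial_sum(f, z, 5000) - left_hand_side(f, z)))
```

Summing over $|n| < p$ pairs the terms $n$ and $-n$, which is what makes the partial sums converge even though the series is not absolutely convergent.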

In this case, we get the partial fraction expansion of the function $\pi/\sin \pi z$:

$$\frac{\pi}{\sin \pi z} = \sum_{\mathbb{Z}} \frac{(-1)^n}{z-n}. \tag{9.7'}$$


Most authors reject the first series or interpret it as the limit of its symmetric 
sums, but in fact taken separately its “positive” and “negative” parts are 
convergent though not absolutely. Indeed, 


$$1/(z-n) = -1/n + z/n(z-n) = -1/n + O(1/n^2)$$

for large $|n|$, so that the sum extended to $n > 0$ is the sum of an absolutely
convergent series and of the alternating series $\sum (-1)^{n+1}/n$, and so converges.
(7’) can also be written as 


with, this time, an absolutely convergent series. 
Replacing $z$ by $\frac12(1-z)$,

$$\frac{\pi}{2\cos(\pi z/2)} = \sum_{\mathbb{Z}} \frac{(-1)^n}{1-2n-z} \tag{9.7''}$$

follows. We will use this formula in n° 15.
Exercise. Deduce from (2) the formulas

$$\frac{\pi \sin az}{\sin \pi z} = 2\sum_{n\ge1} (-1)^n \frac{n \sin na}{z^2-n^2},$$

$$\frac{\pi\cosh az}{\sinh \pi z} = 1/z + 2z\sum_{n\ge1} (-1)^n \frac{\cos na}{z^2+n^2} \qquad (-\pi<a<\pi).$$

Show that the convergence of the first amounts to that of the series $\sum (-1)^n \sin(na)/n$.
Can these formulas be obtained from the theory of Fourier series?


10 — The Gamma Function, the Fourier Transform of $e^{-x}x_+^{s-1}$ and the Hankel Integral


Formula (8.15), which can also be written

$$\int (y-a)^{-n} e(xy)\,dy = (2\pi i)^n x_+^{n-1} e(ax)/(n-1)!,$$

was proved above. It assumes that $n$ is an integer $\ge 2$, $x \in \mathbb{R}$ and that
$\mathrm{Im}(a) > 0$. We intend to show that more generally

$$\int (y-a)^{-s} e(xy)\,dy = (2\pi i)^s x_+^{s-1} e(ax)/\Gamma(s) \tag{10.1}$$

for $x \in \mathbb{R}$, $\mathrm{Re}(s) > 1$ and $\mathrm{Im}(a) > 0$, where

$$\Gamma(s) = \int_0^{+\infty} e^{-t} t^s\,d^*t, \qquad \mathrm{Re}(s) > 0,$$

is Euler’s function (Chapter V, §7, n° 22) and where $d^*t = dt/|t|$. Since, in
(1), the function $(y-a)^{-s}$ and that of the right hand side are continuous and
integrable for $\mathrm{Re}(s) > 1$, this amounts (inversion formula) to showing that
$(y-a)^{-s}$ is, up to a constant factor, the Fourier transform of the function

$$\varphi(x) = e(ax)\,x_+^{s-1} = \exp(2\pi i a x)\,x_+^{s-1} \tag{10.2}$$

where $\mathrm{Im}(a) = c > 0$. As $|\exp(2\pi i a x)| = \exp(-2\pi c x)$, $\mathrm{Re}(s) > 0$ is sufficient
for $\varphi \in L^1(\mathbb{R})$ to hold. Then, setting $w = 2\pi i(y-a)$,

$$\hat\varphi(y) = \int x_+^{s-1} e(ax - xy)\,dx = \int_0^{+\infty} \exp(-wx)\,x^s\,d^*x \tag{10.3}$$

and $\mathrm{Re}(w) = 2\pi\,\mathrm{Im}(a) > 0$. If $w$ were real $> 0$, the following formula could
be obtained by the change of variable $x \mapsto x/w$:

$$\hat\varphi(y) = w^{-s}\int_0^{+\infty} e^{-x} x^s\,d^*x = \Gamma(s)\,w^{-s} \tag{10.4}$$

and so (1) would follow. The analyticity with respect to $w$ remains to be
proved.
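
Before the proof, formula (10.4) can be tested numerically for a non-real $w$. This is only a sanity check under stated assumptions: the names are illustrative, $s$ is taken real $> 1$ to keep the integrand bounded near 0, the integral is truncated at a finite upper limit, and $w^{-s}$ is computed with the principal logarithm, which has $|\mathrm{Arg}(w)| < \pi/2$ on $\mathrm{Re}(w) > 0$.

```python
import cmath
import math

def laplace_power_integral(w, s, upper=60.0, n=120_000):
    """Midpoint rule for the integral of exp(-w x) x^(s-1) over [0, upper], s real > 1."""
    h = upper / n
    total = 0j
    for k in range(n):
        x = (k + 0.5) * h
        total += cmath.exp(-w * x) * x ** (s - 1.0)
    return total * h

def gamma_times_power(w, s):
    """Gamma(s) * w^(-s), with w^(-s) = exp(-s Log w) on the half-plane Re(w) > 0."""
    return math.gamma(s) * cmath.exp(-s * cmath.log(w))

w, s = 1.0 + 2.0j, 2.5
print(laplace_power_integral(w, s), gamma_times_power(w, s))
```

The agreement for complex $w$ is what the analytic extension argument is meant to establish; a check of this kind proves nothing, but it catches sign and branch mistakes.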

(i) The gamma function. Some of the many properties^47 of this function
are already known, to begin with the formulas

$$\Gamma(s+1) = s\Gamma(s), \qquad \Gamma(n) = (n-1)! \quad \text{for } n \ge 1. \tag{10.5.1}$$

This relation immediately shows that $\Gamma(s)$ can be extended analytically to
all of $\mathbb{C}$, excepting the simple poles at $s = 0, -1, \ldots$, where

$$\mathrm{Res}(\Gamma, -n) = (-1)^n/n! \tag{10.5.2}$$

(Chapter V, §7, n° 25, Example 5); in fact,

$$\Gamma(s) = \int_0^1 e^{-t}t^s\,d^*t + \int_1^{+\infty} e^{-t}t^s\,d^*t = \sum_{n\ge0} \frac{(-1)^n}{n!\,(s+n)} + \Gamma^+(s). \tag{10.5.3}$$

The integral over $(1,+\infty)$, $\Gamma^+(s)$, converges for all $s$ and is an entire function.
Contrary to the integral over $(0,1)$, the series obtained by integrating term
by term the exponential series is also convergent for all $s$ other than the poles.
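
Decomposition (10.5.3) can be used as it stands to evaluate $\Gamma$ to the left of the imaginary axis. The sketch below (illustrative names, real non-integer $s$ only, truncated series and truncated integral) compares it with the library function `math.gamma`.

```python
import math

def gamma_via_split(s, terms=60, upper=60.0, n=120_000):
    """Gamma(s) from (10.5.3): series from (0, 1) plus the integral over (1, upper)."""
    series = sum((-1) ** k / (math.factorial(k) * (s + k)) for k in range(terms))
    h = (upper - 1.0) / n
    tail = h * sum(math.exp(-t) * t ** (s - 1.0)
                   for t in (1.0 + (k + 0.5) * h for k in range(n)))
    return series + tail

for s in (0.5, -0.5, -2.5):
    # the series carries the poles, the integral is entire
    print(s, gamma_via_split(s), math.gamma(s))
```

Near $s = -n$ the single term $(-1)^n/n!\,(s+n)$ dominates, which is how the residues (10.5.2) can be read off.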

Setting

$$f_n(x) = \begin{cases} (1-x/n)^n x^s & \text{for } x \le n, \\ 0 & \text{for } x > n, \end{cases}$$

we get a sequence of functions converging to $e^{-x}x^s$ while remaining
dominated by $e^{-x}x^s$ (exercise!); hence

$$\Gamma(s) = \lim \int f_n(x)\,d^*x = \lim n!\,n^s/s(s+1)\ldots(s+n) \tag{10.5.4}$$

(Chapter V, §7, n° 23, Example 1), a priori for $\mathrm{Re}(s) > 0$. This leads to the
expansion

$$1/\Gamma(s) = s\,e^{Cs}\prod_{n\ge1} (1+s/n)\,e^{-s/n} \tag{10.5.5}$$


47 See for example Dieudonné, Calcul infinitésimal (Hermann, 1968), IV.3, IX-4 
to IX-8, Remmert, Funktionentheorie 2, Chap. 2, §2, where the function is de- 
fined by using its infinite product, Freitag and Busam, Funktionentheorie, chap. 
IV, §1 and in particular the exercises, not to mention earlier authors. Entire 
books have been written on it, notably N. Nielsen, Handbuch der Theorie der 
Gammafunktion (Leipzig, 1906, reedit. Chelsea, 1965). 
of the function $1/\Gamma(s)$ into an infinite product convergent everywhere, and so
valid everywhere by analytic extension;

$$C = \lim (1 + \ldots + 1/n - \log n) = 0.577215664\ldots$$

is the Euler constant (Chapter VI, § 2, n° 18). The complement formula (same
reference)

$$\Gamma(s)\Gamma(1-s) = \pi/\sin \pi s \tag{10.5.6}$$

is, for example, obtained by comparing the infinite product expansions of
both sides with that of

$$\sin \pi s = \pi s \prod_{n\ge1} (1 - s^2/n^2)$$

(Chap. IV, n° 18). It follows that

$$\Gamma(1/2) = \pi^{1/2}, \tag{10.5.7}$$

a result that reduces to the integral of $\exp(-\pi x^2)$, and that

$$|\Gamma(1/2+it)|^2 = \pi/\cosh \pi t \quad \text{for } t \in \mathbb{R}. \tag{10.5.8}$$

The duplication formula

$$\Gamma(2s) = \pi^{-1/2}\,2^{2s-1}\,\Gamma(s)\Gamma(s+1/2) \tag{10.5.9}$$


which is often used in analytic number theory, will be needed. A very (too?)
ingenious method^48 for obtaining it consists in using a characterization, due
to Helmut Wielandt (1939), of the gamma function by the following two
properties:

(a) $f$ is holomorphic on a domain $G$ containing the strip $1 \le \mathrm{Re}(z) < 2$ and
bounded on it;
(b) $f(s+1) = sf(s)$ whenever $s, s+1 \in G$.

Property (b) allows us, as in the case of Euler’s function, to first find an
analytic extension of $f$ to all of $\mathbb{C}$ with, at worst, simple poles at the integers
$\le 0$ and

$$\mathrm{Res}(f,-n) = (-1)^n f(1)/n!.$$


As a result, $g(s) = f(s) - f(1)\Gamma(s)$ is an entire function also satisfying
$g(s+1) = sg(s)$. The entire function $h(s) = g(s)g(1-s)$ then satisfies
$h(s+1) = -h(s)$.

48 I found it in Freitag-Busam, Chap. IV, §1 and in Remmert 2, Chap. 2, §2. A
more general, but far less useful, formula can be found in Dieudonné, Calcul
infinitésimal, IX.4.
As $f(s)$ and $\Gamma(s)$ are bounded on $1 \le \mathrm{Re}(s) < 2$, because of their
functional equation, they are bounded at infinity on the entire strip
$a \le \mathrm{Re}(s) \le a+1 \le 2$; the same, therefore, holds for $g$, and as $g$ does not have poles,
it is globally bounded on that strip. Hence $h(s)$ is bounded on the strip
$0 \le \mathrm{Re}(s) \le 1$.

The pseudo periodicity of $h$ then shows that $h$ is bounded on $\mathbb{C}$, and
so is constant, and hence is zero since $g(1) = f(1) - f(1)\Gamma(1) = 0$. Since
$g(s)g(1-s) = 0$ for all $s$, $g(s) = 0$, and so

$$f(s) = f(1)\Gamma(s),$$

qed.
Having done this, we can return to (10.5.9), which can also be written

$$\Gamma(s) = \pi^{-1/2}\,2^{s-1}\,\Gamma(s/2)\,\Gamma[(s+1)/2], \tag{10.5.10}$$

and observe that the function $f(s) = 2^s\Gamma(s/2)\Gamma[(s+1)/2]$ satisfies Wielandt’s
assumptions, which is immediate. Hence $f(s) = f(1)\Gamma(s) = 2\pi^{1/2}\Gamma(s)$, and
the formula follows. See Chap. XII, n° 1 also.
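
The two ingredients of this argument, the duplication formula itself and the functional equation of the auxiliary function, are easy to test on real arguments. A minimal sketch, with illustrative function names and `math.gamma` standing in for $\Gamma$:

```python
import math

def duplication_gap(s):
    """Relative difference between the two sides of (10.5.9) for real s > 0."""
    lhs = math.gamma(2 * s)
    rhs = math.pi ** -0.5 * 2 ** (2 * s - 1) * math.gamma(s) * math.gamma(s + 0.5)
    return abs(lhs - rhs) / abs(lhs)

def wielandt_f(s):
    """f(s) = 2^s Gamma(s/2) Gamma((s+1)/2), the function used in the proof."""
    return 2 ** s * math.gamma(s / 2) * math.gamma((s + 1) / 2)

for s in (0.3, 1.0, 2.7, 5.5):
    # first column: gap in (10.5.9); second: defect in f(s+1) = s f(s)
    print(s, duplication_gap(s), abs(wielandt_f(s + 1) - s * wielandt_f(s)))
```

The value $f(1) = 2\,\Gamma(1/2)\,\Gamma(1) = 2\pi^{1/2}$ is the normalizing constant that appears in the last step.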


(ii) Fourier transform of $e^{-x}x_+^{s-1}$. To show that (10.4) remains valid for
$\mathrm{Re}(w) > 0$, it suffices, by analytic extension, to check that both sides are
holomorphic functions of $w$ on this half-plane. Setting

$$w^{-s} = \exp[-sL(w)] = |w|^{-s} e^{-is\,\mathrm{Arg}(w)} \tag{10.6}$$

where $L(w) = \log|w| + i\,\mathrm{Arg}(w)$ is a uniform branch of the pseudo-function
$\mathrm{Log}(w)$ on $\mathrm{Re}(w) > 0$ (§ 2, n° 4, (i)), this is the case for the right hand side. As
our aim is to recover the usual function for $w$ real $> 0$, we should choose

$$|\mathrm{Arg}(w)| < \pi/2.$$


As for function (3), theorem 9 on integrals depending holomorphically on a
parameter can be applied to it. The only non-obvious condition is the
existence, for every compact subset $H \subset \{\mathrm{Re}(w) > 0\}$, of a function
$p_H(t) \in L^1(\mathbb{R}_+)$ such that

$$|\exp(-wt)\,t^{s-1}| \le p_H(t)$$

for all $w \in H$ and $t > 0$. But the distance from a compact subset of an open
set $U$ to the boundary of $U$ is $> 0$. $H$ is therefore contained in a half-plane
$\mathrm{Re}(w) \ge a$ with $a > 0$; then

$$|\exp(-wt)\,t^{s-1}| \le \exp(-at)\,t^{\mathrm{Re}(s)-1} = p_H(t).$$

This is an integrable function since $a > 0$ and $\mathrm{Re}(s) > 0$, qed.
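(1) can now be justified. Since, by (3) and (4),

$$\hat\varphi(y) = \Gamma(s)\,w^{-s} = \Gamma(s)\,[2\pi i(y-a)]^{-s} \quad \text{for } \mathrm{Re}(s) > 0,$$

this function is of the order of magnitude of $y^{-s}$ at infinity. It is, therefore,
integrable over $\mathbb{R}$ if $\mathrm{Re}(s) > 1$, and as this is also the case of

$$\varphi(x) = e(ax)\,x_+^{s-1}$$

which, moreover, is then continuous everywhere, including at $x = 0$, Fourier’s
inversion formula shows that

$$\Gamma(s)\int [2\pi i(y-a)]^{-s} e(xy)\,dy = x_+^{s-1} e(ax) \tag{10.7}$$

for $\mathrm{Re}(s) > 1$ and $\mathrm{Im}(a) > 0$, provided we set $|\mathrm{Arg}(w)| < \pi/2$. Choosing
$-\pi < \mathrm{Arg}(y-a) < 0$ and $\mathrm{Arg}(2\pi i) = \pi/2$,
$\mathrm{Arg}(w) = \mathrm{Arg}[2\pi i(y-a)] = \mathrm{Arg}(2\pi i) + \mathrm{Arg}(y-a)$, and so

$$[2\pi i(y-a)]^{-s} = (2\pi i)^{-s}(y-a)^{-s} = (y-a)^{-s}/(2\pi i)^s$$

and (7) can be written

$$\int (y-a)^{-s} e(xy)\,dy = \frac{(2\pi i)^s}{\Gamma(s)}\,x_+^{s-1} e(ax) \quad \text{for } \mathrm{Im}(a) > 0,\ \mathrm{Re}(s) > 1; \tag{10.8}$$

this is the formula generalizing (8.15). Similarly, we get

$$\int (y+a)^{-s} e(-xy)\,dy = \frac{(-2\pi i)^s}{\Gamma(s)}\,x_+^{s-1} e(ax) \tag{10.8'}$$

where we have to take $0 < \mathrm{Arg}(y+a) < \pi$ and $\mathrm{Arg}(-2\pi i) = -\pi/2$.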
Replacing $a$ by $z$ and applying Poisson’s summation formula (Chap. VII,
n° 23), we get

$$\sum_{\mathbb{Z}} \frac{1}{(z+n)^s} = \frac{(-2\pi i)^s}{\Gamma(s)} \sum_{n\ge1} n^{s-1}\exp(2\pi i n z) \quad \text{for } \mathrm{Im}(z) > 0 \tag{10.9}$$

and $\mathrm{Re}(s) > 1$, a result generalizing (8.16). The reader should not forget to
check the assumptions allowing the application of the Poisson formula to a
function $f$: the latter is continuous and the series $\sum f(x+n)$ and $\sum \hat f(y+n)$
converge normally on any compact set.
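
For integer values of $s$ no choice of branch is needed and both sides of (10.9) can be summed directly. The following sketch (illustrative names, truncated sums, integer $s \ge 2$ only) compares them.

```python
import cmath
import math

def lattice_sum(z, s, n_max):
    """Symmetric partial sum of the series of (z+n)^(-s) over n in Z, integer s >= 2."""
    return sum((z + n) ** (-s) for n in range(-n_max, n_max + 1))

def q_series(z, s, n_terms=60):
    """(-2 pi i)^s / Gamma(s) times the sum of n^(s-1) exp(2 pi i n z), Im(z) > 0."""
    prefactor = (-2j * math.pi) ** s / math.gamma(s)
    return prefactor * sum(n ** (s - 1) * cmath.exp(2j * math.pi * n * z)
                           for n in range(1, n_terms + 1))

z = 0.3 + 0.5j
print(lattice_sum(z, 4, 2000), q_series(z, 4))
```

The right hand side converges geometrically, at the rate of $|\exp(2\pi i z)| = \exp(-2\pi\,\mathrm{Im}(z))$, whereas the left hand side converges only like a power of the truncation index; this asymmetry is what makes the formula useful.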
(iii) Hankel’s integral. For $x > 0$ and $a = i$, (8) can also be written as

$$2\pi i/\Gamma(s) = \int \frac{e^{2\pi i x(y-i)}}{[2\pi i x(y-i)]^s}\,2\pi i x\,dy, \qquad \mathrm{Re}(s) > 1.$$

Setting $2\pi i x(y-i) = z$, we get an integral computed over the vertical $\mathrm{Re}(z) =
2\pi x = a > 0$. As $dz = 2\pi i x\,dy$, we finally get that

$$2\pi i/\Gamma(s) = \int_{\mathrm{Re}(z)=a>0} e^z z^{-s}\,dz, \qquad \mathrm{Re}(s) > 1. \tag{10.10}$$
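
Formula (10.10) can be checked by brute force. The sketch below rests on several assumptions made here for convenience: real $s > 1$, the vertical line truncated at a finite height, a midpoint rule, and $z^{-s}$ computed with the principal logarithm; the names are illustrative.

```python
import cmath
import math

def vertical_line_integral(s, a=1.0, half_height=400.0, n=80_000):
    """Midpoint rule for the integral of e^z z^(-s) dz over Re(z) = a, |Im(z)| <= half_height."""
    h = 2.0 * half_height / n
    total = 0j
    for k in range(n):
        y = -half_height + (k + 0.5) * h
        z = complex(a, y)
        # principal branch: z^(-s) = exp(-s Log z) with |Arg z| < pi
        total += cmath.exp(z - s * cmath.log(z))
    return total * 1j * h   # dz = i dy along the vertical line

s = 3.5
print(vertical_line_integral(s), 2j * math.pi / math.gamma(s))
```

The truncation error is of the order of `half_height ** (1 - s)`, which is why the check is only practical for $\mathrm{Re}(s)$ well above 1.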


Having said that, and the function $z^{-s}$ being defined on $U = \mathbb{C} - \mathbb{R}_-$ by

$$z^{-s} = |z|^{-s}\exp[-is\,\mathrm{Arg}(z)] \quad \text{with } |\mathrm{Arg}(z)| < \pi, \tag{10.11}$$

the integrand in (10) is a holomorphic function on $U$. We show that a formula
holding for all $s \in \mathbb{C}$ can be obtained by deforming the integration contour.
This is not the case of (10) since the integral diverges for $\mathrm{Re}(s) < 1$. We use
the path of Fig. 10.12, along which the integral is zero, and let $r$ and $R$ be the
radii of the two circular arcs.

Fig. 10.12.

By (11), $|z^{-s}| \asymp |z|^{-\mathrm{Re}(s)}$ on $U = \mathbb{C} - \mathbb{R}_-$ and
in particular for small or large $|z|$. Since, moreover,

$$|e^z| = e^{\mathrm{Re}(z)} \le e^a$$
on the integration contour, the contribution from the large arcs to the integral
is $O(R^{1-\mathrm{Re}(s)})$, hence tends to 0 since $\mathrm{Re}(s) > 1$. For the same reason, the
contribution from the small circular arc is $O(r^{1-\mathrm{Re}(s)})$, but it does not follow
that it approaches 0 as $r$ tends to 0; as always, we have to choose between
convergence at infinity and convergence at 0...

As $R$ increases indefinitely, the integral along AB approaches the integral
(10) we started with and the integrals over the large circular arcs tend to 0.
In view of the directions of integration,

$$2\pi i/\Gamma(s) = \int_{-\infty}^{(0+)} e^z z^{-s}\,dz \qquad \text{(Hankel’s Formula)} \tag{10.12}$$

where the notation, traditional among specialists of special functions,^49
denotes the path GFEDC expanded at infinity on both sides of the plane cut
along the negative real axis. The use of the word “cut”, also traditional, suggests
that if points located over or under $\mathbb{R}_-$ are “infinitely near” in $\mathbb{C}$, from the
point of view of the values of $z^{-s}$, they are not near because, by (11), the factor
$\exp[-is\,\mathrm{Arg}(z)]$ of $z^{-s}$ jumps from $e^{\pi i s}$ to $e^{-\pi i s}$ when one goes from the lower
half-plane to the upper one by crossing $\mathbb{R}_-$. Anyhow, if $U = \mathbb{C} - \mathbb{R}_-$ is equipped
with the topology of $\mathbb{C}$, a sequence of points such as $a_n = -1 + i/n$, converging
in $\mathbb{C}$, does not converge in $U$; neither does the sequence $b_n = -1 - i/n$.
Despite appearances, the latter is in no way near the former with respect to
the topology of $U$.

If we wanted to explain all this in somewhat more “modern” terms
than W&W, we could consider the graph $S$ in $\mathbb{C}^2$ of $\zeta = \mathrm{Log}\,z =
\log|z| + i\,\mathrm{Arg}(z)$, i.e. the set of couples $(z,\zeta) \in \mathbb{C}^2$ such that $\exp(\zeta) = z$. It is,
for good reason, a helicoidal surface similar to the one described in Chapter
IV, §4 in relation to the pseudo-function $\mathrm{Arg}(z)$ and on which a genuine
function $z^{-s}$ can be defined without ambiguity, namely $(z,\zeta) \mapsto \exp(-s\zeta)$.
This graph is connected, but is no longer so if the points for which $z \in \mathbb{R}_-$ are
removed. Indeed, the choice of a uniform branch of $z^{-s}$ on $\mathbb{C} - \mathbb{R}_-$ amounts
to choosing one of the connected components of $S$ deprived of these points. Such
a component is homeomorphic to $\mathbb{C} - \mathbb{R}_-$ under the projection $(z,\zeta) \mapsto z$,
but its two “sides”, which are projected onto $\mathbb{R}_-$, are in no way near each
other with respect to the topology of $\mathbb{C}^2$ since each follows from the other
under the translation $(z,\zeta) \mapsto (z,\zeta + 2\pi i)$. This type of difficulty appears
in the computation of numerous integrals, where there are functions whose
analytic extensions are “many valued” on $\mathbb{C}$, for example the integral of
$(4x^3 - g_2x - g_3)^{1/2}$ which occurs in the theory of elliptic functions and is
the easiest instance of an integral of an algebraic function, or else integrals
involving the function $\log z$, etc.

Returning to Hankel’s formula, the exact form of the integration contour
is unimportant, and integrating over any path homotopic to the vertical
$\mathrm{Re}(z) = a > 0$ in $\mathbb{C} - \mathbb{R}_-$ and not only in $\mathbb{C}$ would be sufficient, for example
over the curve that a metallurgist would obtain by bending a rectilinear iron
wire of infinite length around the impassable barrier erected along $\mathbb{R}_-$ (Fig. 10.13). The
importance of integral (12) lies in the fact that it is well-defined for all $s \in \mathbb{C}$,
because as $\mathrm{Re}(z)$ tends to $-\infty$, the function $e^z$ tends to 0 quickly enough to
neutralize the power functions with which it is multiplied. Using theorem
8, it is not hard to check that the right hand side of (12) is a holomorphic
function of $s$ and thus does represent the left hand side on all of $\mathbb{C}$.

Fig. 10.13.

49 See in particular Whittaker and Watson, A Course of Modern Analysis,
Cambridge UP, 1902.

$\mathbb{R}_-$ could be replaced by any half-line with initial point 0 located in
the half-plane $\mathrm{Re}(z) < 0$, provided the chosen uniform branch of $z^{-s}$ is
defined accordingly. When moving away to infinity along such a cut (and
not randomly in $\mathbb{C}$), $|e^z| = e^{\mathrm{Re}(z)}$ with $\mathrm{Re}(z) \asymp -|z|$, so that the factor $e^z$
tends to 0 sufficiently quickly for the integral to converge for all $s$.

If we are going to present pre-modern mathematics, let us show how the
relation $\Gamma(s)\Gamma(1-s) = \pi/\sin \pi s$ can be recovered by integrating along the
path GFEDC extended to infinity. If the ordinates of the lines GF and DC
are $-\varepsilon$ and $+\varepsilon$, whence $z = -t \pm i\varepsilon$ with $t > 0$, then by (11) the approximations
$z^{-s} = t^{-s}\exp(i\pi s)$ on GF and $z^{-s} = t^{-s}\exp(-i\pi s)$ on DC follow. In view
of the direction followed,

$$\int_{GF} + \int_{DC} = e^{i\pi s}\int_r^{+\infty} t^{-s}e^{-t}\,dt - e^{-i\pi s}\int_r^{+\infty} t^{-s}e^{-t}\,dt.$$

The estimate used above to show that the contribution from the large circular
arcs tends to 0 if $\mathrm{Re}(s) > 1$ equally shows that that of the small arc FED
tends to 0 if $\mathrm{Re}(s) < 1$; in this case, $r$ can be made to approach 0. At the
limit, Hankel’s integral becomes

$$2\pi i/\Gamma(s) = 2i\sin \pi s\int_0^{+\infty} e^{-t}t^{-s}\,dt = 2i\sin \pi s\;\Gamma(1-s).$$

This gives the complement formula for $\mathrm{Re}(s) < 1$ and hence, by analytic
extension, in all of $\mathbb{C}$.
11 — The Dirichlet problem for the half-plane


The computations of the previous n° allow the solution method of the
Dirichlet problem^50 presented in Chapter VII, §5 for functions on the circle $\mathbb{T}$ to be
transposed to functions defined on $\mathbb{R}$. We could avoid using the Fourier
transform and introduce the Poisson transform (11.3) from the start, but
as I explained in the preface to vol. I, my aim is not necessarily to provide
readers with the most direct paths to interesting results. In fact, introducing
the Fourier transform in this very particular situation is neither more nor less
artificial than using Fourier series in the case of the unit disc; the presence of
a group is exploited: the group of rotations about 0 in the case of the disc,
the group of horizontal translations in the case of the half-plane. The method
could be generalized to heat or wave propagation equations

$$du/dt = \Delta u, \qquad d^2u/dt^2 = \Delta u \tag{*}$$

where $u(t,x)$ is a function on $\mathbb{R}_+ \times \mathbb{R}^n$ with (as well as its partial derivative
$du/dt$ in the second case) given values for $t = 0$. The reader will already
be able to practice on the case $n = 1$, the basic idea being the one applied
by Fourier to go from the heat equation on the unit circle to his series: find
the “simple” solutions of the form $f(t)g(x)$, then try to express the general
solution as a “continuous sum” of such solutions; the calculation is easy for
the first equation, but less so for the second one.

50 Recall that, generally speaking, it consists in constructing a harmonic function
on an open set with given values on the boundary. The cases of the unit disc
or of the half-plane cannot give any idea of the difficulty involved in the general
problem in $\mathbb{C}$, even less so in $\mathbb{R}^n$, not to mention generalizations to elliptic
PDEs.

Allowing for notation, in the previous n°, the Fourier transform of the
function

$$\varphi(t) = t_+^{s-1}e(zt), \qquad \mathrm{Im}(z) > 0,\ \mathrm{Re}(s) > 0,$$

has been shown to be

$$\hat\varphi(u) = \Gamma(s)\,[2\pi i(u-z)]^{-s}.$$

On the other hand, we know (Chap. VII, §6, n° 30) that if $f, g \in L^1(\mathbb{R})$, then

$$\int f(t)\hat g(t)\,dt = \int \hat f(u)g(u)\,du; \tag{*}$$

this formula is obtained by calculating the double integral

$$\iint f(t)g(u)e(-tu)\,dt\,du$$

in two different ways. The easiest proof consists in applying the (genuine)
Lebesgue-Fubini theorem; also with a few acrobatics, we could just consider
the version proved in Chapter V, n° 33 for lsc functions (separate the real
and imaginary parts of the functions in question), but it is not worth it.
Choosing^51 $g = \varphi$, for any function $f \in L^1(\mathbb{R})$,


$$\int_0^{+\infty} \hat f(u)\,u^{s-1}e(zu)\,du = \Gamma(s)\int f(t)\,[2\pi i(t-z)]^{-s}\,dt \tag{11.1}$$

for $\mathrm{Re}(s) > 0$ and $\mathrm{Im}(z) > 0$ since these conditions imply $g \in L^1(\mathbb{R})$. In any
event, the case $s = 1$ shows that for every $f \in L^1(\mathbb{R})$,^52

$$\int_0^{+\infty} \hat f(u)e(zu)\,du = \frac{1}{2\pi i}\int \frac{f(t)}{t-z}\,dt = F^+(z) \quad \text{for } \mathrm{Im}(z) > 0, \tag{11.2'}$$

where $F^+$ is defined and holomorphic for $\mathrm{Im}(z) > 0$. The similarity of this
result with Cauchy’s integral formula will not escape anyone, but it is misleading:
$F^+(z)$ is not a holomorphic function reducing to $f$ on $\mathbb{R}$. If this formula
is applied to $f(-t)$, a function whose Fourier transform is $\hat f(-u)$, and if $z$ is
replaced by $-z$, then similarly

$$\int_{-\infty}^{0} \hat f(u)e(zu)\,du = -\frac{1}{2\pi i}\int \frac{f(t)}{t-z}\,dt = -F^-(z) \quad \text{for } \mathrm{Im}(z) < 0, \tag{11.2''}$$

where $F^-$ is defined and holomorphic for $\mathrm{Im}(z) < 0$. This leads us to associate
to every integrable function $f$ on $\mathbb{R}$ the function^53

$$F(z) = \frac{1}{2\pi i}\int \frac{f(t)\,dt}{t-z} = \begin{cases} \int_0^{+\infty} \hat f(u)e(zu)\,du & \text{if } \mathrm{Im}(z) > 0, \\[4pt] -\int_{-\infty}^{0} \hat f(u)e(zu)\,du & \text{if } \mathrm{Im}(z) < 0, \end{cases} \tag{11.3}$$

analogous to the Poisson transform $P_f$ introduced in Chapter V, §5 in the
case of the unit disc [see in particular formula (21.7)]. It is holomorphic


51 The reader will probably observe that $\varphi$ is a regulated function on $\mathbb{R}$ only if
$\mathrm{Re}(s) > 1$ or $s = 1$ since the factor $x_+^{s-1}$ is not bounded in the neighbourhood of
0 for $\mathrm{Re}(s) < 1$ or, for $\mathrm{Re}(s) = 1$, $s \ne 1$, does not approach a limit as $x \to +0$.
All these difficulties disappear in Lebesgue theory.

52 (2’) could be directly obtained by computing the Fourier transform of the
function equal to $e(zy)$ for $y > 0$ and to 0 otherwise, and by applying the general
formula (*).

53 $f(t)\,dt$ could be replaced by a measure $d\mu(t)$ with finite total mass, and it is
precisely by studying such functions that Stieltjes was led to define his integrals.
Example: write the set of rational numbers as a sequence $(u_n)$ and choose the
measure $\mu$ given by $\int f(t)\,d\mu(t) = \sum f(u_n)/n^2$ for continuous $f$ with compact
support. The behaviour of the corresponding function

$$F(z) = \sum 1/n^2(u_n - z)$$

in the neighbourhood of the real axis is not obvious.
outside the real axis and decomposes into the two functions $F^+$ and $F^-$ on
the upper and lower half-planes $H^+$ and $H^-$. These functions being given by
the Fourier integrals occurring in (3), they extend by continuity to the closed
half-planes $\mathrm{Im}(z) \ge 0$ and $\mathrm{Im}(z) \le 0$, provided these integrals converge for
real $z$, which assumes that $f \in F^1(\mathbb{R})$ since $f$ has already been assumed to
be integrable. It is not in general possible to go from one to the other by
analytic extension since, as we will see, their limit values on the real axis
do not coincide. Nonetheless, it is necessary to observe that the left hand
side of (3) continues to be well-defined in a neighbourhood of $z$ if $z$ does not
belong to the support $S$ of $f$, so that (3) defines a holomorphic function on
the connected open set $H^+ \cup H^- \cup (\mathbb{R} - S)$ of $\mathbb{C}$ if the open subset $\mathbb{R} - S$
of $\mathbb{R}$ is not empty; under these circumstances, the analytic extension across
$\mathbb{R} - S$ is possible, which does not prevent the function obtained from suffering
discontinuities across $S$.


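For $z = x + iy$ with $y > 0$, set

$$u_f(x,y) = F^+(z) - F^-(\bar z) = \frac{1}{2\pi i}\int \left(\frac{1}{t-z} - \frac{1}{t-\bar z}\right) f(t)\,dt = \frac{1}{\pi}\int \frac{y}{(t-x)^2+y^2}\,f(t)\,dt, \tag{11.4}$$

where integration is as always over $\mathbb{R}$ when it is not specified. (2’) and (2”)
show that^54

$$u_f(x,y) = \int_0^{+\infty} \hat f(u)\exp(-2\pi uy)e(ux)\,du + \int_{-\infty}^{0} \hat f(u)\exp(+2\pi uy)e(ux)\,du = \int e(ux)\exp(-2\pi y|u|)\hat f(u)\,du. \tag{11.5}$$

As $y$ tends to $+0$, the function $\exp(-2\pi y|u|)$ converges uniformly to 1 on
every compact set by remaining $\le 1$. Therefore, the most elementary version
of the dominated convergence theorem shows that

$$\lim_{y\to+0}\,[F^+(x+iy) - F^-(x-iy)] = \lim_{y\to+0} \int e(ux)\exp(-2\pi y|u|)\hat f(u)\,du = \int e(xu)\hat f(u)\,du = f(x) \tag{11.6}$$

if $f \in F^1(\mathbb{R})$, which explains the impossibility of “gluing back” $F^+$ and $F^-$
into a single holomorphic function on $\mathbb{C}$.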


54 These computations have already been used in Chapter VII, § 6, n° 30 to prove
Fourier’s inversion formula (Theorem 26).
Contrary to (3), definition

$$u_f(x,y) = \frac{1}{\pi}\int \frac{y}{(x-t)^2+y^2}\,f(t)\,dt \tag{11.7}$$

continues to be well-defined for $y > 0$, provided $f$ is bounded on $\mathbb{R}$. Hence
even if it is not possible to define the Poisson transform $F = P_f$, $u_f$ can still be
defined on the upper half-plane; this function is harmonic since the relation

$$u_f(x,y) = \lim_{n\to+\infty} \frac{1}{2\pi i}\int_{-n}^{n} \left(\frac{1}{t-z} - \frac{1}{t-\bar z}\right) f(t)\,dt = \lim_{n\to+\infty}\,[F_n(z) - F_n(\bar z)],$$

where $F_n(z)$ denotes the integral of $f(t)/2\pi i(t-z)$ over $[-n, n]$,
shows that, on the upper half-plane $H^+$, harmonic functions converge
uniformly to $u_f$ on all compact subsets (exercise), and that $u_f$ is therefore
harmonic^55 (Chapter VII, §5, n° 25, Theorem 21). Relation (6), i.e.

$$\lim_{y\to+0} u_f(x,y) = f(x), \tag{11.6'}$$


obtained by supposing $f \in F^1(\mathbb{R})$, applies in fact to all continuous and
bounded functions $f$ on $\mathbb{R}$. Indeed, set

$$P_y(t) = \frac{1}{\pi}\,\frac{y}{t^2+y^2} = y^{-1}P(t/y) \quad \text{where } P(t) = \frac{1}{\pi(t^2+1)} \tag{11.8}$$

for $t \in \mathbb{R}$ and $y > 0$. When $y \to +0$, these functions form, up to countability,
a Dirac sequence on $\mathbb{R}$ in the sense of Chapter V, §8, n° 27: they are
continuous, positive with total integral 1 and, for any $r > 0$, as $y$ tends to 0,
so does the integral

$$\int_{|t|>r} P_y(t)\,dt = y^{-1}\int_{|t|>r} P(t/y)\,dt = \frac{2}{\pi}\int_{r/y}^{+\infty} \frac{dt}{t^2+1}.$$

The arguments of Chapter V immediately imply (6’) for continuous and
bounded $f$ and even

$$\lim_{y\to+0} u_f(x,y) = \frac12\,[f(x+0) + f(x-0)] \tag{11.6''}$$

for $f$ regulated and bounded on $\mathbb{R}$.

55 It may also be observed that, up to a factor $2i$, $1/(t-z) - 1/(t-\bar z)$ is the imaginary
part of the holomorphic function $1/(t-z)$, and so is harmonic for any $t \in \mathbb{R}$,
and the Laplacian of $u_f$ may be calculated by differentiating under the $\int$ sign.

The function $u_f$ is itself continuous and bounded on the closed half-plane
$\mathrm{Im}(z) \ge 0$ if $f$ is continuous and bounded on $\mathbb{R}$. Since $P_y(t)$ is an even
function of $t$,

$$u_f(x,y) = \int P_y(x-t)f(t)\,dt = \int P_y(u)f(x-u)\,du \tag{11.9}$$

indeed follows. This is a convolution product on $\mathbb{R}$, and since $\int P_y(u)\,du =
1$, it proves that

$$|u_f(x,y)| \le \|f\|, \tag{11.10}$$

where the norm is the uniform one on $\mathbb{R}$. On the other hand, for all $a \in \mathbb{R}$,

$$|u_f(x,y) - f(a)| \le \int P_y(u)\,|f(x-u) - f(a)|\,du.$$


For all $r > 0$, the extended integral over $\{|u| > r\}$ is, up to a factor $2\|f\|$,
bounded above by the extended integral of $P_y$ over the same set, and hence,
as seen above, is $< \varepsilon$ for all $x$ given any sufficiently small $y$. In the interval
$|u| \le r$, $|x-u-a| \le |x-a| + r$, and so $|f(x-u) - f(a)| < \varepsilon$ for any $u$
in this interval provided $|x-a|$ and $r$ are sufficiently small. This integral
is, therefore, $< \varepsilon$ since the value of the total integral of $P_y$ is 1. Finally,
$|u_f(x,y) - f(a)| < 2\varepsilon$ for sufficiently small $y$ and $|x-a|$, qed.

Exercise. If $f(x)$ is uniformly continuous on $\mathbb{R}$, the function $x \mapsto u_f(x,y)$
converges uniformly to $f(x)$ on $\mathbb{R}$ as $y \to +0$.
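
The behaviour of $u_f$ near the boundary can be observed on an example for which everything is explicit. The sketch below is an illustration under assumptions made here: the test function is $f(t) = 1/(1+t^2) = \pi P_1(t)$, for which the semigroup property $P_y * P_1 = P_{1+y}$ of the Poisson kernels gives $u_f(x,y) = (1+y)/(x^2 + (1+y)^2)$; the integral over $\mathbb{R}$ is computed after the substitution $t = \tan\theta$; all names are illustrative.

```python
import math

def poisson_extension(f, x, y, n=200_000):
    """Approximate u_f(x, y) = (1/pi) * integral of y f(t) / ((x-t)^2 + y^2) dt, y > 0.

    The substitution t = tan(theta) maps R onto (-pi/2, pi/2); the midpoint
    rule never evaluates the endpoints, where tan(theta) blows up.
    """
    h = math.pi / n
    total = 0.0
    for k in range(n):
        t = math.tan(-math.pi / 2 + (k + 0.5) * h)
        total += y * f(t) * (1.0 + t * t) / ((x - t) ** 2 + y * y)
    return total * h / math.pi

def cauchy(t):
    return 1.0 / (1.0 + t * t)

def exact_extension(x, y):
    """Closed form of u_f for f = cauchy."""
    return (1.0 + y) / (x * x + (1.0 + y) ** 2)

x = 0.7
for y in (1.0, 0.1, 0.01):
    # third column tends to cauchy(x) as y decreases
    print(y, poisson_extension(cauchy, x, y), exact_extension(x, y), cauchy(x))
```

The same routine accepts any bounded continuous `f`, in which case only the convergence towards `f(x)` as `y` decreases can be observed, since no closed form is available in general.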

The function $u_f$, therefore, solves the Dirichlet problem for the half-plane
and for continuous and bounded functions on $\mathbb{R}$: find a harmonic function on
an open set $G$ with given values on the boundary. More precisely, it is one of
the possible solutions since any function of the form $u_f(x,y) + ay$, where $a$
is a constant, is also a solution of the problem. Uniqueness holds in the case
of the unit disc $D$ considered in Chapter VII, §5 (Theorem 22) because, if a
continuous function on the closed disc and harmonic on the interior is zero
on the boundary, then a compactness argument and the maximum principle
show that it is identically zero. But the half-plane is not compact. In fact,
the map

$$z \mapsto \zeta = (z-i)/(z+i)$$

is a conformal representation of the half-plane $\mathrm{Im}(z) > 0$ on the unit disc $D$:
$|\zeta| < 1$; it, therefore, transforms every holomorphic (resp. harmonic) function
on the former to a holomorphic (resp. harmonic) function on the latter. The
real axis $y = 0$ is homeomorphically mapped onto the boundary $|\zeta| = 1$ with
the point 1 removed, which would be obtained by making $z$ approach infinity.
Hence, for continuous functions $f(x)$ on $\mathbb{R}$, the Dirichlet problem in the
half-plane, translated into the language of the unit disc, amounts to finding a
continuous function on $\bar D - \{1\}$, harmonic on $D$ and with given values on
$\mathbb{T} - \{1\}$, which allows every kind of behaviour in the neighbourhood of 1. In
fact,

$$y = \mathrm{Im}(z) = (1 - |\zeta|^2)/|1 - \zeta|^2$$

is easily recognized as the Poisson kernel of the disc; it is harmonic on $D$,
continuous on $\bar D - \{1\}$ and zero on $\mathbb{T} - \{1\}$, but takes every positive value in
all neighbourhoods of 1, so that it does not contradict Dirichlet’s principle
for the unit disc...
For $f$ integrable, apart from the function

$$u_f(x,y) = F^+(z) - F^-(\bar z),$$

the function

$$v_f(x,y) = -i\,[F^+(z) + F^-(\bar z)] \tag{11.11}$$

may be introduced; if $f$ is real, clearly $F^-(\bar z) = -\overline{F^+(z)}$, so that $u_f$ and $v_f$
are, up to a factor of 2, the real and imaginary parts of $F^+(z)$ on the upper
half-plane. Adapting the computations leading to (4) and (5),

$$v_f(x,y) = -\frac{1}{2\pi}\int \left(\frac{1}{t-z} + \frac{1}{t-\bar z}\right) f(t)\,dt = \frac{1}{\pi}\int \frac{x-t}{(t-x)^2+y^2}\,f(t)\,dt = -i\int e(xu)\,\mathrm{sgn}(u)\exp(-2\pi y|u|)\hat f(u)\,du \tag{11.12}$$

follows. Let us investigate the limit behaviour of this function when $y > 0$
approaches 0, but by supposing now that $f \in F^1(\mathbb{R})$. As done above for $u_f$,
it is possible to pass to the limit under the $\int$ sign in the Fourier integral, and
so

$$\lim_{y\to+0} v_f(x,y) = -i\int e(xu)\,\mathrm{sgn}(u)\hat f(u)\,du. \tag{11.13}$$


Let us now consider the first integral (12), which can also be written

$$-\pi v_f(0,y) = \int \frac{t}{t^2+y^2}\,f(t)\,dt = \int_0^{+\infty} \frac{t^2}{t^2+y^2}\,g(t)\,dt/t$$

for $x = 0$, where $g(t) = f(t) - f(-t)$. If $f$ is differentiable at the origin, then

$$f(t) = f(0) + f'(0)t + o(t), \qquad f(-t) = f(0) - f'(0)t + o(t)$$

and, therefore, $g(t) = 2f'(0)t + o(t)$; hence the function $g(t)/t$ is integrable
in the neighbourhood of 0, as well as at infinity like $f$. As $y$ approaches
0, the function $t^2/(t^2+y^2)$ approaches 1 while always remaining $\le 1$. The
elementary version of the dominated convergence theorem can, therefore, be
applied again, which shows that

$$\lim_{y\to+0}\,-\pi v_f(0,y) = \int_0^{+\infty} g(t)\,dt/t = \lim_{\varepsilon\to0} \int_{|t|>\varepsilon} f(t)\,dt/t.$$

Following traditions, set

$$\mathrm{p.v.}\int_{-\infty}^{+\infty} f(t)\,dt/t = \lim_{\varepsilon\to0} \int_{|t|>\varepsilon} f(t)\,dt/t. \tag{11.14}$$
This is the Cauchy principal value for functions integrable on $\mathbb{R}$ and
differentiable at the origin, which is an essential condition. Comparing the limit
just obtained with (13) for $x = 0$, we thus finally get the formula

$$\mathrm{p.v.}\int f(t)\,dt/t = \pi i\int \mathrm{sgn}(u)\hat f(u)\,du, \tag{11.15}$$

which at least holds under the following conditions: $f$ is differentiable at the
origin and belongs to $F^1(\mathbb{R})$.
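
The sign and the constant in this formula are easy to get wrong, so a numerical test on an explicit function is worthwhile. The sketch below uses a test function chosen here, not in the text: $f(t) = 1/\pi(1+(t-1)^2)$, whose Fourier transform for the convention $e(x) = \exp(2\pi i x)$ is $\exp(-2\pi|u|)\,e(-u)$. Both sides are approximated by midpoint rules on truncated intervals, and the names are illustrative.

```python
import cmath
import math

def f(t):
    """Test function: a Cauchy density centred at 1."""
    return 1.0 / (math.pi * (1.0 + (t - 1.0) ** 2))

def f_hat(u):
    """Fourier transform of f for the kernel e(-tu) = exp(-2 pi i t u)."""
    return math.exp(-2.0 * math.pi * abs(u)) * cmath.exp(-2j * math.pi * u)

def principal_value(upper=200.0, n=400_000):
    """p.v. of the integral of f(t)/t, computed as the integral of (f(t) - f(-t))/t over (0, upper)."""
    h = upper / n
    return h * sum((f(t) - f(-t)) / t for t in ((k + 0.5) * h for k in range(n)))

def fourier_side(upper=5.0, n=100_000):
    """pi i times the integral of sgn(u) f_hat(u) over |u| <= upper."""
    h = upper / n
    total = 0j
    for k in range(n):
        u = (k + 0.5) * h
        total += f_hat(u) - f_hat(-u)   # sgn(u) f_hat(u) on both half-lines
    return 1j * math.pi * total * h

print(principal_value(), fourier_side())
```

For this particular $f$ both sides can also be computed by hand (by residues on the left, by elementary integration on the right) and are equal to $1/2$.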

It is in particular the case if $f \in \mathcal{S}(\mathbb{R})$, the Schwartz space. As a function
of $f$, the left hand side $T(f)$ can immediately be checked to be a tempered
distribution (Chapter VII, §6, n° 32), denoted v.p.$1/t$ by some; indeed,

$$|T(f)| \le \pi\int |\hat f(u)|\,du = \pi\int |\hat f(u)(1+4\pi^2u^2)|\,(1+4\pi^2u^2)^{-1}\,du \le \frac{\pi}{2}\,\sup |\hat f(u)(1+4\pi^2u^2)|.$$

Now, $(1+4\pi^2u^2)\hat f(u)$ is the Fourier transform of $f(t) - f''(t)$, and hence for
all $u$ is bounded above by the integral

$$\int |f(t) - f''(t)|\,dt = \int (1+t^2)\,|f(t) - f''(t)|\,(1+t^2)^{-1}\,dt \le \pi\,\sup (1+t^2)|f(t) - f''(t)| \le \pi\,\sup (1+t^2)|f(t)| + \pi\,\sup (1+t^2)|f''(t)| \le N_2(f)$$

up to a constant factor, and where, as in Chapter VII, §6, eq. (32.3), the
topology of $\mathcal{S}(\mathbb{R})$ is defined by the seminorms

$$N_r(f) = \sum_{p,q\le r} \sup |x^p f^{(q)}(x)|.$$

This proves that $f \mapsto T(f)$ is continuous with respect to the topology of
$\mathcal{S}(\mathbb{R})$.

The Fourier transform of the tempered distribution T is, by definition, the
distribution $\hat T(f) = T(\hat f)$; therefore, the significance of the formula obtained
is that the Fourier transform of the distribution v.p.1/t is the distribution
−πi.sgn(u), which is in fact a function since, in distribution theory, a function
φ(u) is always identified with the distribution f ↦ ∫ f(u)φ(u)du. Conversely,
the Fourier transform of the function −πi.sgn(u), a Fourier transform that is
not well-defined in classical theory as the function sgn(u) is not integrable,
is v.p.1/t. This theory has given rise to generalizations in several dimensions
that play an important role in some aspects of the theory of partial differential
equations.

When f is bounded on R without being integrable, the function

$F(z) = \frac{1}{2\pi i}\int \frac{f(t)}{t-z}\,dt = \lim F_n(z)$

is not well-defined in general. $u_f(x,y)$ can indeed be defined: if f is real, it
is, up to a factor of 2, the real part of the non-existent function F(z); but its
imaginary part is defined by a divergent integral (12). However, if the integral
that should define F is differentiated under the ∫ sign with respect to z, we
now get a function

$G(z) = \frac{1}{2\pi i}\int \frac{f(t)\,dt}{(t-z)^2} = \frac{1}{2\pi i}\int \frac{d}{dz}\left(\frac{1}{t-z}\right) f(t)\,dt, \qquad \mathrm{Im}(z) \ne 0,$

defined and holomorphic for Im(z) ≠ 0 and satisfying G(z) = F′(z) when
F exists. Since (3′) cannot be used to define the function F that we are
looking for, a primitive for G on Im(z) > 0 could perhaps be used instead as
a substitute. To simplify, f will be assumed to be real in the rest of this n°.

Let us first prove an important result which could have been another 
corollary of Theorem 3 in n° 3 related to the existence of primitives of holo- 
morphic functions: 


Theorem 10. On a simply connected domain, a real harmonic function is
the real part of a holomorphic function, which is unique up to the addition
of a pure imaginary constant.


Uniqueness is obvious. Let u be a harmonic function; suppose there is a
holomorphic function f = u + iv such that u = Re f. Then

$f' = D_1u + iD_1v$   and   $D_1v = -D_2u,$

and so $f' = D_1u - iD_2u$. Conversely, starting with a harmonic function u,
Laplace's equation says precisely that $D_1u - iD_2u$ is holomorphic. Now, on
a simply connected domain, this function has a primitive

(*)   $f(z) = \int_a^z \left(D_1u(\zeta) - iD_2u(\zeta)\right)d\zeta,$

where integration is along an arbitrary path connecting a fixed point a to the
point z in G. Setting f = p + iq,

$f' = D_1p - iD_2p = D_1u - iD_2u,$

so that the derivatives of u and p are identical. Hence u = p + c, where c is
a real constant and the function f(z) + c solves the problem, qed.
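
As an illustration of the construction (*), here is a minimal numerical sketch in Python. The harmonic function u(x, y) = e^x cos y, the base point a = 0, and the helper names `g`, `primitive` and `holomorphic_extension` are choices made for this example only; the path is a straight segment, which is legitimate here because the plane is simply connected.

```python
import cmath
import math

def u(x, y):
    # Example harmonic function (an assumption of this sketch): u = e^x cos y
    return math.exp(x) * math.cos(y)

def g(z):
    # The holomorphic function D_1 u - i D_2 u for the u above
    x, y = z.real, z.imag
    d1u = math.exp(x) * math.cos(y)
    d2u = -math.exp(x) * math.sin(y)
    return complex(d1u, -d2u)

def primitive(z, a=0j, steps=2000):
    # Composite Simpson rule for the integral (*) along the segment from a to z
    h = (z - a) / steps
    total = g(a) + g(z)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * g(a + k * h)
    return total * h / 3

def holomorphic_extension(z, a=0j):
    # Adding the real constant c = u(a) makes the real part equal to u
    return primitive(z, a) + u(a.real, a.imag)

z = 0.7 + 1.3j
f = holomorphic_extension(z)
print(f, cmath.exp(z))        # both values should agree closely
print(f.real, u(z.real, z.imag))
```

For this particular u the function obtained is e^z, whose real part is indeed e^x cos y.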

Coming back to the construction of a primitive for G, we see that a
holomorphic function $F_1$ exists on the half-plane $H^+$ : Im(z) > 0 such that

(11.17)   $u_f(x,y) = F_1(z) + \overline{F_1(z)}.$

Differentiate (17) with respect to x. For holomorphic functions, differentiating
with respect to x is the same as differentiating with respect to z. Thus

(11.18)   $F_1'(z) + \overline{F_1'(z)} = D_1u_f(x,y) = \frac{1}{2\pi i}\int \frac{d}{dx}\left(\frac{1}{t-z} - \frac{1}{t-\bar z}\right) f(t)\,dt$
$\qquad\qquad = \frac{1}{2\pi i}\int \left(\frac{1}{(t-z)^2} - \frac{1}{(t-\bar z)^2}\right) f(t)\,dt = G(z) + \overline{G(z)}$

provided differentiation under the ∫ sign with respect to x in (7) is
justified, which is an easy application of Theorem 9 of n° 7. As the
holomorphic functions $F_1'$ and G have the same real parts,

$G(z) = F_1'(z) + ia,$

where a is a real constant; the function

$F^+(z) = F_1(z) + iaz$

is, therefore, a primitive for G on Im(z) > 0. Since a is real,

$F^+(z) + \overline{F^+(z)} = u_f(x,y) - 2ay.$

Denoting by F(z) the function equal to $F^+(z)$ for Im(z) > 0 and to
$-\overline{F^+(\bar z)}$ for Im(z) < 0, the following result is finally obtained:


Theorem 11. For any continuous and bounded function f on R, there is a
function F defined and holomorphic on C − R such that

$\lim_{y\to 0+}\,[F(x+iy) - F(x-iy)] = f(x)$   for all x ∈ R.


12 — The Complex Fourier Transform 


(i) Generalities. Given a function on R that will be denoted $\hat f(t)$ for reasons
specified later, the function

(12.1)   $f(z) = \int e(tz)\,\hat f(t)\,dt = \int e(tx)\exp(-2\pi ty)\,\hat f(t)\,dt$

is called the (inverse...) complex Fourier transform of $\hat f$. This assumes that
the integral converges absolutely for a non-empty set of values of z, which for
example excludes the function exp(πt²). It also excludes functions for which
(1) only converges for real z, for example rational functions, since in this
context, f(z) is expected to be defined and holomorphic on an open subset
of C.
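
Definition (12.1) is easy to probe numerically. The following Python sketch approximates the integral by the trapezoidal rule on a truncated interval and compares the result, for the Gaussian exp(−πt²), with the value exp(−πz²). The truncation bound `T`, the number of nodes `n`, and the function names are assumptions of this sketch, not part of the text.

```python
import cmath
import math

def e(w):
    # e(w) = exp(2*pi*i*w), the notation used in (12.1)
    return cmath.exp(2j * math.pi * w)

def complex_fourier(f_hat, z, T=8.0, n=4000):
    # Trapezoidal approximation of the integral of e(tz) f_hat(t) over [-T, T]
    h = 2 * T / n
    total = 0.5 * (e(-T * z) * f_hat(-T) + e(T * z) * f_hat(T))
    for k in range(1, n):
        t = -T + k * h
        total += e(t * z) * f_hat(t)
    return total * h

def gaussian(t):
    return math.exp(-math.pi * t * t)

z = 0.3 + 0.5j
print(complex_fourier(gaussian, z), cmath.exp(-math.pi * z * z))
```

The check is only numerical; it does not replace the argument asked for in the exercise that follows.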

Exercise 1. Show that for $\hat f(t)$ = exp(−πt²) the integral converges for all
z and is equal to exp(−πz²).

Formula

$\int_0^{+\infty} t^{s-1}e(zt)\,dt = \Gamma(s)(-2\pi iz)^{-s}, \qquad \mathrm{Re}(s) > 0,\ \mathrm{Im}(z) > 0,$

which amounts to re-writing relations (10.2) to (10.4) differently, is another
example; here the function $\hat f(t)$ is $t^{s-1}$ for t > 0 and 0 for t < 0.

As |e(tz)| = exp(−2πty) and

$\exp(-2\pi ty) \le \exp(-2\pi a't) + \exp(-2\pi b't)$   if a′ ≤ y ≤ b′,

if (1) converges absolutely for y = a′ and y = b′ > a′, it converges for
a′ ≤ y ≤ b′. The set of values of y such that

(12.2)   $\int |\hat f(t)|\exp(-2\pi ty)\,dt < +\infty$

is, therefore, an interval I = (a,b) of a priori arbitrary nature; f(z) is defined
on the horizontal strip B: y ∈ I. For any compact interval I′ = [a′,b′] ⊂
I, the function being integrated is dominated by the integrable function⁵⁶
$|\hat f(t)|\,(e^{-2\pi a't} + e^{-2\pi b't})$ on the closed strip Im(z) ∈ I′ since a′,b′ ∈ I. In
other words, integral (1) converges normally⁵⁷ on the closed strip B′: y ∈ I′.
As a result (theorem 9 of n° 7),


(1) f is continuous and bounded on B′, and so is continuous but not neces-
sarily bounded on B since I is the union of the intervals I′,

(2) f is holomorphic on the open strip y ∈ ]a′,b′[, hence on the interior
a < y < b of the strip B which is the union of these open strips,

(3) the derivatives of f can be computed by differentiating under the ∫ sign.


To make sure that condition (2) is satisfied in a given open interval I =
]a,b[, it suffices to suppose that

(12.3)   $a < y < b \implies \sup_{t\in\mathbb{R}} |\hat f(t)|\exp(-2\pi ty) < +\infty.$

Indeed, if this condition holds, for given y, a′ and b′ can be chosen so that
a < a′ < y < b′ < b. As (3) is satisfied for y = a′ and y = b′, for large |t|,

$\hat f(t) = O[\exp(2\pi a't)], \qquad \hat f(t) = O[\exp(2\pi b't)],$

and so

$|\hat f(t)|\exp(-2\pi ty) \le M\exp[2\pi(a'-y)t],$

$|\hat f(t)|\exp(-2\pi ty) \le M\exp[2\pi(b'-y)t].$

As a′ − y < 0, the first relation shows that $\hat f(t)\exp(-2\pi ty)$ approaches 0
exponentially as t tends to +∞; since b′ − y > 0, the second shows that the
same is true as t tends to −∞; this is more than needed to ensure (2). This
argument also shows that if $\hat f$ satisfies (3), the same holds for $t^n\hat f(t)$ for all
n ∈ N.

⁵⁶ Recall that “integrable” means “absolutely integrable”, except in very rare cases
when it is specified to be otherwise.

⁵⁷ An integral ∫ f(x,y)dμ(x), defined for y ∈ E, converges normally on A ⊂ E if
there is a μ-integrable positive function $p_A(x)$ such that |f(x,y)| ≤ $p_A(x)$ for
all y ∈ A and all x. This is clearly analogous to the normal convergence of a
sequence of functions.


Setting z = x + iy, the function x ↦ f(x + iy) is the usual inverse Fourier
transform of $\hat f(t)e(ity)$. If

(12.4)   $\int |f(x+iy)|\,dx < +\infty$   for all y ∈ I

and if $\hat f$ is continuous,⁵⁸ the Fourier inversion formula applies (Chap. VII,
n° 30, theorem 26) and

$\hat f(t)e(ity) = \int f(x+iy)e(-tx)\,dx,$

in other words,

(12.5)   $\hat f(t) = \int_{\mathrm{Im}(z)=y} f(z)e(-tz)\,dz$   for all y ∈ I,

which is an integral à la Cauchy along the unbounded path t ↦ t + iy.

Finding conditions ensuring (4) and hence (5) is easy. Indeed, if the deriva-
tives of order ≤ p of a function φ of class $C^p$ are integrable over R, then
$\hat\varphi(v) = o(|v|^{-p})$ is known to hold at infinity (Chap. VII, n° 31, lemma 2);
$\hat\varphi$ is, therefore, integrable, if p ≥ 2. The method used then (integration by
parts) applies here: as 2πiz e(tz) is the derivative of e(tz) with respect to t,

$2\pi iz\,f(z) = \int \hat f(t)\cdot 2\pi iz\,e(tz)\,dt = -\int \hat f'(t)e(tz)\,dt$

if the function $\hat f'(t)e(tz)$ is integrable. Iterating the computation, we conclude
that if $\hat f^{(p)}$ exists and if $\hat f^{(r)}(t)e(tz)$ is integrable for y ∈ I and all r ≤ p (in
other words, if the $\hat f^{(r)}$ satisfy the same assumption as $\hat f$), then the function
$z^p f(z)$ is, like f(z), bounded on every strip of finite width contained in B.
When p ≥ 2, this is enough to prove (4) since in this case, $f(x+iy) = O(|x|^{-p})$
at infinity.

⁵⁸ When Lebesgue theory is available, this assumption is superfluous: if the Fourier
transform of an integrable function is integrable, then the given function is almost
everywhere equal to a continuous function for which the inversion formula holds
everywhere.

Exercise 3. Supposing that f satisfies the conditions just proved, show di-
rectly that integral (5) is independent of y ∈ I. (Integrate along a horizontal
rectangle in B).

We could also use Dirichlet’s theorem (Chap. VII, n° 30, theorem 27) and
show that

$\lim_{N\to+\infty}\int_{-N}^{N} f(z)e(-tz)\,dz = \frac{1}{2}\left[\hat f(t+0) + \hat f(t-0)\right]$

for all y ∈ I if $\hat f$ is right and left differentiable everywhere; the integral
is extended over the interval |x| ≤ N of the horizontal Im(z) = y.


(ii) A Paley-Wiener theorem. One of the problems of the theory is char-
acterizing the complex Fourier transforms of functions $\hat f$ of a given category.
The simplest result is related to the space D(R) of $C^\infty$ functions with com-
pact support in R. In this case, f is defined for all z ∈ C, and hence is an
entire function, and if $\hat f$ vanishes outside a compact interval [−a, a], then

$|f(z)| \le \sup_t|\hat f(t)|\,\int_{-a}^{a}\exp(-2\pi ty)\,dt.$

The exponential is bounded above for all t by exp(2πa|y|) on the integration
interval, and so

$f(z) = O\left(e^{2\pi a|y|}\right)$   on C.

We saw that if $\hat f$ is replaced by its derivative of order n, then the function
f(z) is replaced by $(-2\pi iz)^n f(z)$. Thus

(PW)   $z^n f(z) = O\left(e^{2\pi a|y|}\right)$   on C

for all n ∈ N. Conversely:


Theorem 12 (Paley-Wiener). Let f be an entire function. The following
two conditions are equivalent:

(i) There is a number a > 0 such that f satisfies (PW) for all n;
(ii) f is the complex Fourier transform of a $C^\infty$ function vanishing outside
[−a, a].


It suffices to prove that (i) ⟹ (ii). The function $z^n f(z)$ being bounded on
every horizontal and in particular on R, the Fourier transform

$\hat f(t) = \int f(x)e(-tx)\,dx$

of f on R is well-defined and is $C^\infty$: this can be seen by differentiating under
the ∫ sign. To show that it is zero for t > a, integrate f(z)e(tz) along the
path consisting of the interval (−R, R) followed by the upper half-circle; the
result is zero since the function integrated is holomorphic everywhere. By
(PW), for all n, there is an upper bound

$|f(z)e(tz)| = |f(z)|\exp(-2\pi ty) \le c_nR^{-n}\exp[2\pi(a-t)y]$

over the half-circle for y > 0. For t > a, the exponential is ≤ 1; the integral
along the half-circle is then $O(R^{1-n})$ for all n and tends to 0. For t < −a,
replace the upper half-circle by the lower one. Then $\hat f(t) = 0$ for |t| > a.

It remains to check that $f(z) = \int \hat f(t)e(tz)\,dt$. As the Fourier transform
$\hat f$ of f(x) is in D(R), the inversion formula shows that f is the inverse Fourier
transform of $\hat f$ on R. The complex Fourier transform of $\hat f$ being holomorphic
everywhere and equal to f on R, the analytic extension principle (Chap. II,
n° 20) shows they are both identical on C, qed.
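
The easy direction of the theorem, the bound $|f(z)| \le e^{2\pi a|y|}\int|\hat f(t)|\,dt$ for a function supported in [−a, a], can be observed numerically. The sketch below is only an illustration: the bump function exp(−1/(a² − t²)), the value a = 1, the midpoint-free Riemann sum and the helper names are all assumptions of this example.

```python
import cmath
import math

A = 1.0  # half-width of the support [-A, A] (hypothetical choice)

def bump(t):
    # A standard C-infinity function vanishing outside ]-A, A[
    return math.exp(-1.0 / (A * A - t * t)) if abs(t) < A else 0.0

def nodes(n):
    # Interior quadrature nodes; the integrand vanishes at the endpoints
    h = 2 * A / n
    return h, [-A + k * h for k in range(1, n)]

def transform(z, n=4000):
    # Riemann sum for f(z) = integral of e(tz) * bump(t) over [-A, A]
    h, ts = nodes(n)
    return h * sum(cmath.exp(2j * math.pi * t * z) * bump(t) for t in ts)

def l1_norm(n=4000):
    h, ts = nodes(n)
    return h * sum(bump(t) for t in ts)

for z in (3 + 2j, -1 - 1.5j, 5j):
    bound = math.exp(2 * math.pi * A * abs(z.imag)) * l1_norm()
    print(z, abs(transform(z)), bound)
```

Since the same nodes are used for both sums, the printed modulus never exceeds the printed bound, which mirrors the triangle-inequality argument in the text.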

Exercise 4. Let S be the Schwartz space, i.e. the set of $C^\infty$ functions
φ(t), t ∈ R, such that all functions $t^p\varphi^{(q)}(t)$ are bounded on R (Chap. VII,
§6, n° 31). Let $S_+$ be the set of the φ ∈ S that are zero for t ≤ 0, so that
$\varphi^{(n)}(0) = 0$ for all n. (i) Show that the complex Fourier transform f of any
φ ∈ $S_+$ is defined and holomorphic for y > 0 and that, for all p, q ∈ N, the
function $z^p f^{(q)}(z)$ is bounded on the half-plane Im(z) > 0. (ii) Conversely,
let f be a holomorphic function on y > 0 satisfying this condition. Show
that, for any y > 0, the function x ↦ f(x + iy) is in the space S and that the
integral of f(z)e(−tz) along the horizontal Im(z) = y does not depend on
y. Denoting this integral φ(t), show that φ ∈ $S_+$ and that f is its complex
Fourier transform.


(iii) Holomorphic functions integrable over a strip. Let

I = ]a,b[ ⊂ R

be an open interval, p(y) a continuous function with values > 0 on I, and B
the open horizontal strip Im(z) ∈ I. Let f be a holomorphic function on B;
suppose that

(12.6)   $\iint_{B'} |f(z)|\,p(y)\,dm(z) < +\infty$

for any closed horizontal strip of finite width B′ ⊂ B.⁵⁹ We show that, under
these conditions, f is a complex Fourier transform. The proof uses the relation
shown in n° 4, (iv) between compact convergence and mean convergence, i.e.
in the sense of the L¹ norm. In what follows, we will use the notation

p(y)dm(z) = dμ(z).

⁵⁹ As the inequality 0 < m ≤ p(y) ≤ M < +∞ holds in every compact set contained
in I, condition (6) does not really involve p. We will see in the next section that
this is no longer the case if B′ is replaced in (6) by the strip B. This is why I
introduce a seemingly superfluous function p(y) here.

Lemma. Condition (6) holds for any closed horizontal strip B′ ⊂ B of
finite width if and only if the series ∑ f(z + n) converges normally on every
compact subset of B.


Suppose that (6) holds and that K ⊂ B is a compact set; normal conver-
gence on a compact set being a local property (BL), K may be assumed to
be contained in the interior G of a compact rectangle K′ ⊂ B defined by

K′ : m ≤ x ≤ m + 1,   a′ ≤ y ≤ b′,

where [a′, b′] ⊂ I is compact and where m ∈ R. (4.16) can then be applied to
K and G, giving an upper bound

(12.7)   $|f(z)| \le M\iint_{K'} |f(w)|\,d\mu(w)$   for all z ∈ K,

with a constant M independent of f. Replacing z ↦ f(z) by z ↦ f(z + n),
we get

(12.8)   $|f(z+n)| \le M\iint_{K'} |f(w+n)|\,d\mu(w) = M\iint_{K'+n} |f(w)|\,d\mu(w),$

where K′ + n is the image of K′ under the translation w ↦ w + n, a translation
that leaves the measure μ invariant. For any z ∈ K, the series ∑ f(z + n) is,
therefore, up to a factor M, dominated by the numerical series

(12.9)   $\sum_n \iint_{K'+n} |f(w)|\,d\mu(w) = \iint_{B'} |f(w)|\,d\mu(w) < +\infty,$

where B′ is the closed strip a′ ≤ y ≤ b′. Hence the series ∑ f(z + n) is
normally convergent on K if integral (6) is finite. Conversely, this condition
implies (6) since if there is a convergent numerical series ∑ $u_n$ such that
|f(z + n)| ≤ $u_n$ for all z ∈ K′ and all n, then

$\iint_{B'} |f(z)|\,d\mu(z) = \sum_n \iint_{K'+n} |f(z)|\,d\mu(z) = \sum_n \iint_{K'} |f(z+n)|\,d\mu(z) \le \mu(K')\sum_n u_n,$

qed.
Exercise 5. We replace |f(z)| by $|f(z)|^p$ in condition (6) for a given real
number p ≥ 1. Using (4.16) for the exponent p, show that the series

$\sum_n |f^{(r)}(z+n)|^p$

converge normally on all compact subsets and that

$\int |f^{(r)}(x+iy)|^p\,dx < +\infty$

for all y ∈ ]a, b[ and all r ∈ N. Converse?

Returning to the case p = 1, this result sends us back to Chap. VII, n° 29
where it was shown how to deduce Fourier’s inversion formula from Poisson’s
summation formula. Consider the function x ↦ f(x + iy) for given y ∈ I. As
the series ∑ f(z + n) converges normally on every compact subset,⁶⁰

(12.10)   $\int |f(x+iy)|\,dx < +\infty$

for all y ∈ ]a, b[, which allows the Fourier transform

(12.11)   $\int f(x+iy)e(-tx)\,dx = F(t,y)$

to be defined; it is a continuous function of t.
Let us show that there is a function $\hat f(t)$ on R such that

(12.12)   $F(t,y) = \hat f(t)e(ity).$

Formula (11), multiplied by e(−ity) = exp(2πty), can also be written

$F(t,y)e(-ity) = \int_{\mathrm{Im}(z)=y} f(z)e(-tz)\,dz.$

This is the Cauchy integral along the horizontal Im(z) = y, and it all amounts
to showing that it is independent of y. To compare its values at y′ and y″,
let us integrate over the rectangle ABCD bounded by the horizontals AB :
Im(z) = y′ and DC : Im(z) = y″ and the verticals DA : Re(z) = −n and BC :
Re(z) = n; by Cauchy, the result is zero. |f(z)e(−tz)| = |f(n + iy)| exp(2πty)
on BC; as the series ∑ f(z + n) converges normally on the compact interval
[y′, y″] of the imaginary axis, the function y ↦ f(n + iy) converges uniformly
to 0 on [y′, y″] as |n| increases. At the limit, the contributions from the vertical
sides are, therefore, zero. This gives the expected result. Hence, there is a
relation

(12.13)   $\int f(x+iy)e(-tx)\,dx = \hat f(t)e(ity)$


⁶⁰ Applied to the double integral (6), the version of the Lebesgue-Fubini theorem
for lsc positive functions (Chap. V, n° 33, theorem 31) only states that the
function y ↦ ∫ |f(x + iy)|dx, with values ≤ +∞ (non-strict inequality), is integrable
as a lsc function, and so is finite “almost everywhere”; but it could very well be
infinite for some values of y. The fact that it is finite everywhere, with no zero
measure exception, is one of the many miracles of the theory of holomorphic
functions: all the seemingly pathological phenomena of the theory of integration
disappear. This meta-theorem, useful for making conjectures on what is going on,
does not exempt us from giving proper proofs.

with a function

(12.14)   $\hat f(t) = \int_{\mathrm{Im}(z)=y} f(z)e(-tz)\,dz$

independent of y and that can serve as the Fourier transform of f.

Let us show that the Fourier inversion formula can be applied to (11).
Since the series ∑ f(z + n) converges normally, f(x + iy) is integrable in x and
approaches 0 as |x| increases indefinitely; (11) can, therefore, be computed
by integrating by parts. f being holomorphic, $D_1f = f'$ and the usual com-
putation shows that

(12.15)   $\int f^{(r)}(x+iy)e(-tx)\,dx = (2\pi it)^r F(t,y).$

It follows that for all y ∈ I and all r,

(12.16)   $F(t,y) = O(|t|^{-r})$   as |t| → +∞,

a condition more than sufficient to justify the use of the inversion formula.
In view of (12), it can be written

$f(z) = \int \hat f(t)e(tz)\,dt$

and proves that f is the complex Fourier transform of $\hat f$.

Besides, (16) shows that ∑ |F(n, y)| < +∞ for all y, which allows Pois-
son’s summation formula to be applied (Chap. V, n° 27, theorem 24) to
x ↦ f(x + iy); in view of (12), it can be written

$\sum f(z+n) = \sum \hat f(n)e(nz).$

Ultimately, the following formulas are obtained:

(12.17)   $\hat f(t) = \int_{\mathrm{Im}(z)=y} f(z)e(-tz)\,dz,$

(12.18)   $f(z) = \int \hat f(t)e(tz)\,dt,$

(12.19)   $\sum f(z+n) = \sum \hat f(n)e(nz).$

They obviously assume that a < y < b, a condition that, as seen earlier,
makes the integrals absolutely convergent.
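
Formula (12.19) can be checked numerically in the simplest case. The sketch below takes f(z) = exp(−πz²), for which $\hat f(n)$ = exp(−πn²), and compares the two sides at a complex point; the truncation `N` and the function names are assumptions of this example.

```python
import cmath
import math

def lhs(z, N=12):
    # Truncated sum of f(z + n) with f(z) = exp(-pi z^2)
    return sum(cmath.exp(-math.pi * (z + n) ** 2) for n in range(-N, N + 1))

def rhs(z, N=12):
    # Truncated sum of f_hat(n) e(nz) with f_hat(n) = exp(-pi n^2)
    return sum(math.exp(-math.pi * n * n) * cmath.exp(2j * math.pi * n * z)
               for n in range(-N, N + 1))

z = 0.2 + 0.3j
print(lhs(z), rhs(z))   # the two sums should agree to rounding error
```

Both series converge extremely fast here, so a small truncation is enough; for a general f satisfying (6) the rate of convergence depends on f.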


(iv) Holomorphic functions integrable over a half-plane. Let P be the half-
plane Im(z) > 0. In what precedes, take I = ]0,+∞[ and B = P and instead
of (6), impose the stronger condition

(12.20)   $\iint_P |f(z)|\,p(y)\,dx\,dy < +\infty$

on the functions f(z). The results of section (iii) apply. On the other hand,
(13) and (16) show that, for any y ∈ I and any r ∈ N,

(12.21)   $\hat f(t) = O\left[|t|^{-r}\exp(2\pi ty)\right]$   as |t| → +∞;

this relation, seemingly better than (3), suffices to ensure convergence of (18)
for y > 0. Relation (13) also shows that

$|\hat f(t)|\exp(-2\pi ty) \le \int |f(x+iy)|\,dx$

and so

$\int_0^{+\infty} |\hat f(t)|\exp(-2\pi ty)\,p(y)\,dy \le \iint |f(x+iy)|\,p(y)\,dx\,dy < +\infty$

for all t. The function $\hat f$ is, therefore, zero for the values of t for which the
integral ∫ exp(−2πty)p(y)dy is divergent. For example, if

$p(y) = y^{k-2}$

with k real but not necessarily an integer, as will now be supposed, which is
an important case in the theory of automorphic functions, then

(12.22)   $\hat f(t) \ne 0 \implies \int_0^{+\infty}\exp(-2\pi ty)\,y^{k-2}\,dy < +\infty.$

The convergence of the integral requires k > 1 and t > 0, which shows in
particular that, for k ≤ 1, the only holomorphic solution of (20) is f(z) = 0.

There still remains to be shown that non-zero solutions of (20) effectively
exist for k > 1. To obtain a holomorphic function on P, let us try

(12.23)   $g(z) = (z-\bar w)^{-p}$

with Im(w) > 0. Setting w = u + iv, the change of variable x = u + ξ(y + v)
shows that

$\int |g(x+iy)|\,dx = \int \left[(x-u)^2 + (y+v)^2\right]^{-p/2}dx = (y+v)^{1-p}\int (\xi^2+1)^{-p/2}\,d\xi.$

This result is finite if p > 1. It remains to check that the integral

$\int_0^{+\infty} (y+v)^{1-p}\,y^{k-2}\,dy, \qquad v > 0,$

converges. This supposes that k > 1 (convergence at y = 0) and that p > k
(convergence at infinity), which implies the condition p > 1 that has already
been found. Hence (20) has non-zero solutions for k > 1, but none for k ≤ 1.
Summarizing:

Theorem 13. Let k be a real number. The space $H^1_k(P)$ of holomorphic
functions such that

(12.24)   $\iint_{\mathrm{Im}(z)>0} |f(z)|\,y^{k-2}\,dx\,dy < +\infty$

does not reduce to {0} if and only if k > 1. Any holomorphic solution of (24)
is the complex Fourier transform of a continuous function $\hat f(t)$ which is zero
for t ≤ 0. Formulas (17), (18) and (19) hold.


Exercise 6. Calculate the function $\hat f(t)$ corresponding to (23). p need
not be assumed to be an integer nor a real since function (23) has uniform
branches on y > 0.

To conclude this section, note that, while we have obtained important
properties of the functions $\hat f$, we have not characterized them as we did in
the Paley-Wiener theorem; as far as I know (given the flood of publications
over the last fifty years, it is necessary to be prudent), the answer to this
question will never be known. It is, however, fully known if the functions f(z)
are taken to be square integrable on the half-plane: these are complex Fourier
transforms of the functions $\hat f(t)$ which are zero for t < 0 and for which

$\int_0^{+\infty} |\hat f(t)|^2\,t^{1-k}\,dt < +\infty;$

but these are now functions in the Lebesgue L² space and the result cannot
be obtained without the help of the whole theory of Fourier transforms on
L². This topic will be presented in Chap. XI.


13 — The Mellin Transform 


(i) Questions of convergence. To obtain Paley-Wiener type theorems that hold
for meromorphic functions rather than holomorphic ones as in the previous
n° (an important problem in analytic number theory, for example), it is
useful to reformulate the complex Fourier transform. If the change of variable
exp(2πt) = u is carried out in the integral (12.1) that defines it, then u > 0
and

2πdt = du/u = d*u.

Setting iz = s, we get $e(tz) = u^s$ and $\hat f(t) = F(e^{2\pi t})$, and so

$2\pi f(z) = \int_0^{+\infty} F(u)\,u^s\,d^*u.$

We often come across integrals of this type; they fall within the general
framework of the Mellin transform, which associates the function

(13.1)   $\Gamma_f(s) = \int_0^{+\infty} f(x)\,x^s\,d^*x,$

where $x^s$ = exp(s log x) is the real, positive function of Chap. IV for real s, to
“any” function⁶¹ f(x) defined for x > 0. The notation recalls the fact that

(13.2)   $\Gamma_f(s) = \Gamma(s)$   if   $f(x) = e^{-x}.$

As $\Gamma_f(is)$ is the complex Fourier transform of $t \mapsto 2\pi f(e^{2\pi t})$, the general
statements of n° 12, (i) are easily translated.
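
A short numerical sketch of (13.1) and (13.2) for real s: with the substitution x = e^v (so that d*x = dv; the factor 2π of the text is dropped here for convenience), the Mellin integral becomes an ordinary integral over R that the trapezoidal rule handles well. The truncation bounds `lo`, `hi`, the number of nodes and the function name `mellin` are assumptions of this example.

```python
import math

def mellin(f, s, lo=-40.0, hi=6.0, n=20000):
    # Gamma_f(s) = integral over x > 0 of f(x) x^s d*x, computed with x = e^v
    h = (hi - lo) / n
    total = 0.5 * (f(math.exp(lo)) * math.exp(s * lo)
                   + f(math.exp(hi)) * math.exp(s * hi))
    for k in range(1, n):
        v = lo + k * h
        total += f(math.exp(v)) * math.exp(s * v)
    return total * h

def decay(x):
    return math.exp(-x)

for s in (1.0, 2.5, 4.0):
    print(s, mellin(decay, s), math.gamma(s))
```

The truncation at `lo` limits the accuracy when s is close to 0, where the integral converges slowly at the origin.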
In conformity with a long tradition, setting

s = σ + it,

absolute convergence of (1) only depends on Re(s) = σ. Clearly, $\Gamma_f$ is defined
on a strip of the plane of the form Re(s) ∈ I, where I is a priori an arbitrary
interval; it is holomorphic on the interior of this strip and bounded on any
closed vertical strip of finite width where it is defined. Theorem 9 of n° 7
shows that its derivatives are given by

(13.3)   $\Gamma_f^{(n)}(s) = \int_0^{+\infty} f(x)\,\log^n x\;x^s\,d^*x$

for all n ∈ N. The logarithmic factor does not destroy convergence: since we
are in the interior of the convergence strip, the integral $\int f(x)x^s d^*x$ indeed
converges at s = a and s = b, where a < Re(s) < b, and it is sufficient to
observe that

$(\log x)^n x^s = o(x^a)$   as x tends to 0,

$(\log x)^n x^s = o(x^b)$   as x tends to +∞

to justify the result, which is anyhow guaranteed by theorem 9 of n° 7.

Convergence of the Mellin integral in the neighbourhood of 0 is ensured
for Re(s) > 0 if f is bounded in the neighbourhood of 0, while it converges
at infinity if f is integrable at infinity with respect to dx and if the function
$x^{s-1}$ is bounded at infinity, i.e. for Re(s) ≤ 1. The strip where the function $\Gamma_f$
is defined is, therefore, at least 0 < Re(s) ≤ 1 if f is bounded and integrable
with respect to dx over $\mathbb{R}_+$.

As $\Gamma_f(is)$ is the value of the complex Fourier transform g(z) of $2\pi f(e^{2\pi u}) =
\hat g(u)$ at z = is, and as there is sometimes an inversion formula

$\hat g(u) = \int_{\mathrm{Im}(z)=y} g(z)e(-uz)\,dz$

in the horizontal strip where g is defined, a similar result on Mellin transforms
will presumably follow by making suitable assumptions; the variable changes
z = is (and so dz = ids) and $e^{2\pi u} = x$ as above are sufficient to obtain it.

⁶¹ In practice, f(x) is regulated and almost always continuous for x > 0.

The corresponding formula can be written

(13.4)   $2\pi i\,f(x) = \int_{\mathrm{Re}(s)=\sigma} \Gamma_f(s)\,x^{-s}\,ds,$

where integration is over a vertical t ↦ σ + it located in the convergence
strip of the integral defining $\Gamma_f$. This is the Mellin inversion formula. Like
the Fourier inversion formula, to which it is equivalent, it holds under the
following assumptions, which are merely translations of theorem 26 of
Chap. VII, n° 30:

(a) the function f is continuous for x > 0,
(b) $\int |f(x)x^s|\,d^*x < +\infty$ for Re(s) = σ,
(c) integral (4) is absolutely convergent.

For the Mellin version of Paley-Wiener, a given function φ(s) has to be shown
to be a Mellin transform. The problem is the same: suppose φ to be holo-
morphic on a strip a < Re(s) < b and define a function f using (4), where $\Gamma_f$
is replaced by φ and where a < σ < b. If for a given σ, f and φ satisfy con-
ditions (a), (b) and (c), then conversely $\varphi(s) = \int f(x)x^s d^*x$ for Re(s) = σ.
Again, this is only a translation of Fourier’s inversion formula. In practice,
the given function φ decreases sufficiently rapidly at infinity for integral
(4) to be independent of σ ∈ ]a, b[.

The most frequent tendency is to move the vertical over which integration
is performed to regions where, like Euler’s function, $\Gamma_f$ is only defined by an-
alytic extension; condition (b) is no longer satisfied and formula (4) becomes
false in general. To rectify it, we take into account the residues of $\Gamma_f$ at the
poles encountered while moving the integration vertical, initially located in
the region where condition (b) holds, to a vertical where it no longer does. Sec-
tion (iv) of this n°, and more so chapter XII, will explain this fundamental
point.


(ii) Analytic extension of a Mellin transform. In practice, often the aim
is to show that the function $\Gamma_f$ can, like Euler’s function, be analytically
extended beyond the integral’s domain of convergence. This question is de-
termined by the behaviour of f(x) for x in the neighbourhood of 0 or very
large, because the functions to which the transformation is applied are in
practice, as will be assumed, at least continuous for x > 0. Powerful results
can be obtained from simple assumptions on the asymptotic behaviour of f
in the neighbourhood of 0 and infinity.

First consider the integral

(13.5′)   $\Gamma_f^-(s) = \int_0^1 f(x)\,x^s\,d^*x$

(the choice of the limit 1 is convenient but not essential) and suppose that
the function f has a bounded expansion

(13.6′)   $f(x) = a_1x^{u_1} + \ldots + a_nx^{u_n} + O(x^{u_{n+1}})$

in the neighbourhood of 0, with real exponents $u_1 < \ldots < u_n < u_{n+1}$. As

$\int_0^1 x^s\,d^*x = 1/s$   if Re(s) > 0,

integral (5′) converges for Re(s) > $-u_1$ and is equal to

(13.7)   $\sum_{1\le k\le n} \frac{a_k}{s+u_k} + \int_0^1 O(x^{u_{n+1}})\,x^s\,d^*x$

on this half-plane. The ∑ is a rational function whose poles and residues are
prominently displayed, while the integral in the last term converges and is
holomorphic for Re(s) > $-u_{n+1}$. (5′) can, therefore, be defined by analytic
extension on this half-plane. If f admits an unbounded asymptotic expan-
sion⁶² of the form

(13.8′)   $f(x) \approx \sum a_nx^{u_n}, \quad x \to 0,$   with $u_n < u_{n+1}$ and lim $u_n$ = +∞,

function (5′) can be analytically extended to all of C, with simple poles
and residues equal to $a_n$ at all points $-u_n$. If for example f(x) is $C^\infty$ on
$\mathbb{R}_+$, with the origin included, then by MacLaurin’s formula [Chap. V, n° 18,
eq. (18.11)], there is an unbounded asymptotic expansion

$f(x) \approx \sum f^{(n)}(0)\,x^n/n!$

Function (5′), a priori defined for Re(s) > 0, can, therefore, be extended to C
with the points 0, −1, −2, ... removed, points where it has simple poles and
residues equal to the corresponding numbers $f^{(n)}(0)/n!$. This has already
been seen for f(x) = $e^{-x}$ in relation to Euler’s Γ function; in this case, the
MacLaurin series can even be integrated term by term over ]0, 1], which gives
an expansion

$\Gamma_f^-(s) = \sum f^{(n)}(0)/n!\,(s+n)$

with a convergent series for all s. The same holds for any analytic function
f in the neighbourhood of 0 provided the radius of convergence R of its
MacLaurin series is > 1. If R ≤ 1, decompose $\mathbb{R}_+^*$ into ]0,a] and [a,+∞[ with


⁶² As a general rule, formula f(x) ≈ ∑ $u_n(x)$ means that (i) $u_{n+1}(x) = o[u_n(x)]$
in the neighbourhood of the point considered, (ii) f(x) = $u_1(x) + \ldots + u_n(x)$ +
$O[u_{n+1}(x)]$ for any n. This does not in any way imply that the series ∑ $u_n(x)$
converges to f(x); it can in fact diverge and that is often the case in practice,
in particular for the Taylor series of a non-analytic $C^\infty$ function. See Chap. VI,
n° 10.

a < R in order to integrate the power series term by term over ]0,a]; the
contribution from ]0,a] is then equal to $\sum f^{(n)}(0)\,a^{n+s}/n!\,(s+n)$, a series
that converges even better than the power series of f at x = a since |s + n|
increases indefinitely. The residue of the function at s = −n can be immedi-
ately calculated since $a^{n+s} = 1$ at this point; hence we again get $f^{(n)}(0)/n!$.
This result is, as it should be, independent of the chosen point a.

Through a change of variable x ↦ 1/x, which leaves the measure d*x
invariant, the integral

(13.5″)   $\Gamma_f^+(s) = \int_1^{+\infty} f(x)\,x^s\,d^*x$

reduces to the preceding case. However,

$\int_1^{+\infty} x^s\,d^*x = -1/s$   if Re(s) < 0.

Hence if there is a bounded expansion

(13.6″)   $f(x) = b_1x^{-v_1} + \ldots + b_nx^{-v_n} + O(x^{-v_{n+1}})$

at infinity, with $v_1 < \ldots < v_n < v_{n+1}$, integral (5″), a priori defined and
holomorphic on Re(s) < $v_1$, can be analytically extended to the half-plane
Re(s) < $v_{n+1}$, with simple poles at the points $v_k$ and residues equal to
$-b_k$, the opposites of the coefficients (because of the sign in the previous
formula). In particular, if there is an unbounded asymptotic expansion

(13.8″)   $f(x) \approx \sum b_nx^{-v_n}, \quad x \to +\infty,$   with lim $v_n$ = +∞,

it can be extended to all of C, except at some simple poles.

For example, choose the function f(x) = x/(1 + x²) = f(x⁻¹); f(x) ~ x in
the neighbourhood of 0. This gives the convergence condition Re(s) + 1 > 0,
and f(x) ~ 1/x at infinity, which in turn gives the convergence condi-
tion Re(s) − 1 < 0; the Mellin integral, therefore, converges on the strip
|Re(s)| < 1. For 0 < x < 1, $f(x) = \sum (-1)^n x^{2n+1}$, which is much better
than an asymptotic expansion. $\Gamma_f^-(s)$, a priori defined for Re(s) > −1, can
be analytically extended to all of C, with simple poles at the points −2n − 1
and residues equal to $(-1)^n$. Similarly,

$f(x) = \sum (-1)^n x^{-2n-1}$

at infinity. So $\Gamma_f^+(s)$, a priori defined for Re(s) < 1, can be extended to C,
with simple poles at the points 2n + 1. However, $\Gamma_f(s) = \Gamma_f^-(s) + \Gamma_f^+(s)$ on the
strip |Re(s)| < 1 where these three functions are defined and holomorphic.
Since the right hand term is meromorphic on C, it follows that $\Gamma_f$ can be
analytically extended to all of C, its singularities being simple poles at the
points 2n + 1, n ∈ Z, with residues equal to $(-1)^{n+1}$. In fact, we will show in
n° 15 that

$\Gamma_f(s) = \pi/2\cos(\pi s/2),$

which is much more precise, but we rarely have the chance of being able to
calculate everything explicitly.
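
For this example both the Mellin transform and the inversion formula (13.4) can be tested numerically. The sketch below is an illustration under stated assumptions: it takes the closed form π/(2 cos(πs/2)) quoted above as given, uses the substitution x = e^v for the direct integral, integrates along the vertical Re(s) = σ = 0 for the inverse, and the truncation parameters and function names are choices of this example.

```python
import cmath
import math

def mellin_forward(s, L=80.0, n=16000):
    # Integral of x/(1+x^2) * x^s d*x for real |s| < 1; with x = e^v the
    # integrand becomes exp(s v) / (2 cosh v)
    h = 2 * L / n
    total = 0.0
    for k in range(n + 1):
        v = -L + k * h
        w = 0.5 if k in (0, n) else 1.0
        total += w * math.exp(s * v) / (2 * math.cosh(v))
    return total * h

def closed_form(s):
    # The value announced in the text: pi / (2 cos(pi s / 2))
    return math.pi / (2 * cmath.cos(math.pi * s / 2))

def mellin_inverse(x, sigma=0.0, T=40.0, n=8000):
    # (1 / 2 pi i) * integral over Re(s) = sigma of Gamma_f(s) x^(-s) ds,
    # with s = sigma + it and ds = i dt
    h = 2 * T / n
    total = 0j
    for k in range(n + 1):
        t = -T + k * h
        s = complex(sigma, t)
        w = 0.5 if k in (0, n) else 1.0
        total += w * closed_form(s) * cmath.exp(-s * math.log(x))
    return (total * h / (2 * math.pi)).real

print(mellin_forward(0.5), closed_form(0.5).real)
print(mellin_inverse(2.0), 2.0 / (1 + 2.0 ** 2))
```

The inverse is computed on a vertical inside the strip |Re(s)| < 1, where conditions (a), (b) and (c) hold for this f.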

If f decreases rapidly at infinity, i.e. if $f(x) = O(x^{-N})$ for all N, there is
no need to think: integral (5″) converges on all of C and is an entire function
of s.

As in the previous example, these two types of results apply if f admits
unbounded asymptotic expansions in the neighbourhood of 0 or for large x,
obviously provided that the integral defining $\Gamma_f$ is to start with convergent
on a strip of non-zero width. Otherwise “gluing” the meromorphic functions
defined by (5′) and (5″) would be impossible: the function 1 has the nicest
asymptotic behaviour in the world, but its Mellin transform is not defined;
in this case, $\Gamma_f^-(s) = 1/s$ and $\Gamma_f^+(s) = -1/s$, and so $\Gamma_f(s) = 0$ if $\Gamma_f$ were
well-defined: absurd assumptions lead to absurd conclusions.

The fact that the Mellin transform does not have simple poles is due to 
the nature of the asymptotic expansions that we have admitted. In more 
complicated cases, there may be multiple poles. Suppose for example that, 
in the neighbourhood of 0, f is the sum of a function of type (6’) and of a 
finite number of terms x? log? x, with p > 0. The contribution made by these 
terms to (5’) converges for Re(s) > —p since logx = O(a~") for all r > 0; it 
is equal to cg(s +p), where 


1 

(13.9) Cq(8) =} log’ x.a*'dz, Re(s)>0. 
0 

Integrating by parts, the fully integrated term $x^s \log^q x$ vanishing at both 
endpoints for q ≥ 1, we get 

$s\,c_q(s) = -q\,c_{q-1}(s)$. 

As $c_0(s) = 1/s$, $c_1(s) = -1/s^2$, it follows that $c_2(s) = 2/s^3$ 
and more generally 

(13.10) $c_q(s) = (-1)^q\,q!/s^{q+1}$. 


Replacing s by s + p, we see that the presence of a term in $x^p \log^q x$ in the 
asymptotic expansion introduces a pole of order q + 1 at the point −p. 


(iii) Example: the Riemann zeta function. The function $\exp(-\pi u^2)$ being 
equal to its Fourier transform, that of $u \mapsto \exp(-\pi x u^2)$, where x > 0, is 
$v \mapsto x^{-1/2}\exp(-\pi v^2/x)$; at infinity, these functions approach 0 sufficiently 
rapidly for Poisson’s formula to be written 

$\sum \exp(-\pi n^2 x) = x^{-1/2} \sum \exp(-\pi n^2/x)$, 

where summation is over Z (Chap. VII, n° 28). As will be seen below, the 
results of section (ii) apply to the Jacobi function 

$\theta(x) = \sum_{n \in \mathbb{Z}} \exp(-\pi n^2 x) \quad (x > 0)$. 


First, it is $C^\infty$; differentiating the series term by term p times, up to a 
constant factor, we indeed get $\sum n^{2p}\exp(-\pi n^2 x)$. Since, for t > 0, $t^N e^{-t}$ is 
bounded by a constant $M_N$ for all N > 0, for any n ≠ 0 there is an upper 
bound of the form $n^{2p}\exp(-\pi n^2 x) \le M_N\,n^{2p}/(n^2 x)^N$. Choosing N > 2p+2, 
it follows that the derived series are all normally convergent in x > c for any 
c > 0 (strict inequality). The result follows. This calculation 
also shows that 


$\theta(x) = 1 + O(x^{-N})$ at infinity 

for all N, the term 1 coming from the term n = 0 of the series, while the 
derivatives satisfy 

$\theta^{(p)}(x) = O(x^{-N})$ at infinity. 

The functional equation then shows that 

$\theta(x) = x^{-1/2}\,[1 + O(x^{N})]$ for $x \to 0$. 

The results of section (ii) can, therefore, be applied to 

$f(x) = \theta(x^2) - 1$, 

a function for which 

$f(x) = O(x^{-N})$ at infinity, $f(x) = 1/x - 1 + O(x^{N})$ at 0. 
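
As a numerical sanity check (not part of the original argument), the following minimal Python sketch evaluates the Jacobi function by truncating its series and compares both sides of the functional equation coming from Poisson's formula. The function names `theta` and `f` and the truncation parameter `terms` are illustrative assumptions.

```python
import math

def theta(x, terms=50):
    """Truncated Jacobi function: sum over all integers n of exp(-pi n^2 x), x > 0."""
    return 1.0 + 2.0 * sum(math.exp(-math.pi * n * n * x) for n in range(1, terms + 1))

def f(x):
    """f(x) = theta(x^2) - 1, the function whose Mellin transform is studied."""
    return theta(x * x) - 1.0

for x in (0.3, 1.0, 2.5):
    # Functional equation: theta(x) = x^(-1/2) * theta(1/x)
    print(x, theta(x), theta(1.0 / x) / math.sqrt(x))

# Behaviour at 0: f(x) is extremely close to 1/x - 1 for small x
print(f(0.2), 1.0 / 0.2 - 1.0)
```

The truncation at 50 terms is only adequate for x that is not too small; for very small x more terms would be needed before the functional equation is used.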


Convergence at infinity of the integral defining $\Gamma_f(s)$ requires no 
condition, but convergence at 0 requires Re(s) > 1. A formal 
calculation shows that $\Gamma_f(s)$ is equal to 

$\int_0^{+\infty} d^*x \sum_{n \ne 0} \exp(-\pi n^2 x^2)\,x^s = \sum_{n \ne 0} \int_0^{+\infty} \exp(-\pi n^2 x^2)\,x^s\,d^*x$ 

$= \sum_{n \ne 0} (\pi n^2)^{-s/2}\,\frac12 \int_0^{+\infty} \exp(-y)\,y^{s/2}\,d^*y = \pi^{-s/2}\,\Gamma(s/2)\,\zeta(s)$, 


where $\zeta(s) = \sum_{n>0} 1/n^s$ is the Riemann series converging for Re(s) > 1. To 
justify the permutation of the signs ∫ and ∑, it suffices (Chap. V, n° 23, 
theorem 21) to show that (1) the series being integrated converges normally 
on every compact set K ⊂ ]0,+∞[, which is obvious since this is the case of 
the theta series and since $x^{s-1}$ is bounded on K, (2) the series 

$\sum_{n \ne 0} \int |\exp(-\pi n^2 x^2)\,x^s|\,d^*x$ 

converges, which is obvious by our formal calculation since it is proportional 
to that of Riemann. 
Hence the function 

$\Gamma_f(s) = \pi^{-s/2}\,\Gamma(s/2)\,\zeta(s) = \xi(s)$ 

can be analytically extended to all of C, with simple poles at s = 0 and 
s = 1 coming from the first two terms of the asymptotic expansion of f 
at 0; the residues are equal to 1 at s = 1 and to −1 at s = 0. As the 
function $1/\Gamma(s/2)$ is entire and has simple zeros at 0, −2, −4, ..., the function 
$\zeta(s) = \pi^{s/2}\,\Gamma_f(s)/\Gamma(s/2)$ is meromorphic on C. As $\Gamma(1/2) = \pi^{1/2} \ne 0$, 
the simple pole of $\Gamma_f$ at s = 1 spreads to ζ(s), with a residue equal to 
$\pi^{1/2}\pi^{-1/2} = 1$. As $1/\Gamma(s/2)$ vanishes at s = 0, the pole of $\Gamma_f$ at this point is 
neutralized by the zero of the function $1/\Gamma(s/2)$. Therefore, the function ζ(s) has a 
unique singularity in C: a simple pole at s = 1. It vanishes at s = −2, −4, ..., 
as well as at several other points less obvious at first. 

It also satisfies a functional equation which can be deduced from that of 
the Jacobi function. Indeed, 

$f(1/x) = \theta(1/x^2) - 1 = x\,\theta(x^2) - 1 = x\,[f(x) + 1] - 1$, 

$f(1/x) = x f(x) + x - 1$. 

Hence, for Re(s) > 1, 

$\Gamma_f^-(s) = \int_0^1 f(x)\,x^s\,d^*x = \int_1^{+\infty} f(1/x)\,x^{-s}\,d^*x = \int_1^{+\infty} \left[ f(x)\,x^{1-s} + x^{1-s} - x^{-s} \right] d^*x$. 
Each of the three functions occurring in the last integral is integrable over 
[1,+∞[ with respect to d*x for Re(s) > 1: this is obvious for the last 
two, and the first one is integrable for all s since it decreases rapidly at 
infinity. In conclusion, 

$\Gamma_f^-(s) = \Gamma_f^+(1-s) + 1/(s-1) - 1/s$ 

for Re(s) > 1, and so, by analytic extension, for all s ∈ C. As $\Gamma_f = \Gamma_f^- + \Gamma_f^+$, 

$\Gamma_f(s) = \Gamma_f^+(s) + \Gamma_f^+(1-s) + 1/(s-1) - 1/s$, 

which proves that 

$\xi(s) = \xi(1-s)$. 
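
The decomposition above can be checked numerically. The sketch below is only an illustration under stated assumptions: the names `gamma_plus` and `xi`, the Simpson rule, the cut-off `upper` and the truncation `terms` are all choices made here, not taken from the text. It evaluates ξ(s) from the integral over [1, +∞[ alone and compares it with $\pi^{-s/2}\Gamma(s/2)\zeta(s)$ at s = 2 and s = 4, using the classical values ζ(2) = π²/6 and ζ(4) = π⁴/90.

```python
import math

def f(x, terms=20):
    """f(x) = theta(x^2) - 1 = 2 * sum_{n>=1} exp(-pi n^2 x^2), truncated."""
    return 2.0 * sum(math.exp(-math.pi * (n * x) ** 2) for n in range(1, terms + 1))

def gamma_plus(s, upper=8.0, steps=4000):
    """Simpson approximation of the integral of f(x) x^(s-1) dx over [1, upper].

    The tail beyond `upper` is negligible because f decreases like exp(-pi x^2).
    `steps` must be even.
    """
    h = (upper - 1.0) / steps
    total = 0.0
    for k in range(steps + 1):
        x = 1.0 + k * h
        weight = 1 if k in (0, steps) else (4 if k % 2 else 2)
        total += weight * f(x) * x ** (s - 1.0)
    return total * h / 3.0

def xi(s):
    """xi(s) = Gamma_f^+(s) + Gamma_f^+(1-s) + 1/(s-1) - 1/s, for real s other than 0 and 1."""
    return gamma_plus(s) + gamma_plus(1.0 - s) + 1.0 / (s - 1.0) - 1.0 / s

print(xi(2.0), math.pi / 6)          # pi^(-1) * Gamma(1) * zeta(2)
print(xi(4.0), math.pi ** 2 / 90)    # pi^(-2) * Gamma(2) * zeta(4)
```

The symmetry ξ(s) = ξ(1 − s) is visible directly in the formula used by `xi`, since both the integral part and the rational part are unchanged when s is replaced by 1 − s.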
As pages and pages have been written and continue to be written on the 
Riemann function, we will not pursue these investigations here; for the sequel 
see Chap. XII. 


(iv) A Paley-Wiener type theorem. There are Paley-Wiener type results 
for the Mellin transform. The next theorem, a fairly long application of 
Cauchy theory, is an example; the method and even the result are used to study 
the relations between Dirichlet series and modular functions. 


Theorem 14. Let $S_+ = S(\mathbb{R}_+)$ be the set of functions defined and infinitely 
differentiable for x ≥ 0 and that are, together with their derivatives, rapidly 
decreasing at infinity. For any $f \in S(\mathbb{R}_+)$, the Mellin transform 

$\Gamma_f(s) = \int_0^{+\infty} f(x)\,x^s\,d^*x = \varphi(s)$ 

has the following properties: 


(i) $\Gamma_f$ is defined and holomorphic for Re(s) > 0 and can be extended 
analytically to a meromorphic function on all of the plane whose only 
singularities are at most simple poles at s = 0, −1, −2, ...; 

(ii) for any n ∈ N, the function $s^n\Gamma_f(s)$ is bounded at infinity on every 
vertical strip of finite width.⁵³ 


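Moreover, 

(13.11) $2\pi i\,f(x) = \int_{\mathrm{Re}(s)=\sigma} \Gamma_f(s)\,x^{-s}\,ds$ if σ > 0 

and for all p ∈ N, 

(13.12) $2\pi i\,f(x) = \int_{\mathrm{Re}(s)=\sigma} \Gamma_f(s)\,x^{-s}\,ds + 2\pi i \sum_{0 \le k \le p} a_k x^k$ if −p − 1 < σ < −p, 

where $a_k = \mathrm{Res}(\Gamma_f, -k) = f^{(k)}(0)/k!$. 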

Conversely, any function φ satisfying conditions (i) and (ii) is the Mellin 
transform of a unique $f \in S_+$, given by (11); then for all m, n ∈ N the functions 
$s^m\varphi^{(n)}(s)$ are rapidly decreasing functions at infinity on every vertical strip 
of finite width. 


The proof can be split up into several parts. 
(a) Assertions (i) and (ii) for $\varphi = \Gamma_f$, where $f \in S_+$. Assertion (i) was 
proved before the theorem. So was the formula 

(13.13) $\mathrm{Res}(\varphi, -k) = a_k = f^{(k)}(0)/k!$. 

⁵³ It is in fact bounded on every subset of C defined by inequalities of the form 
a ≤ σ ≤ b, |t| ≥ c where a, b, c ∈ R and c > 0 (set s = σ + it). Getting near the 
poles of the function must clearly be avoided. 
To prove (ii), first note that $\Gamma_f(s)$ is bounded on any strip 0 < a ≤ Re(s) ≤ 
b < +∞ due to the simple fact that the integral converges for s = a and 
s = b. Having said this, let us integrate by parts for Re(s) > 0: 

$s\,\Gamma_f(s) = \int_0^{+\infty} f(x)\,s\,x^{s-1}\,dx = f(x)\,x^s \Big|_0^{+\infty} - \int_0^{+\infty} f'(x)\,x^s\,dx = -\Gamma_{f'}(s+1)$; 


the part fully integrated is zero because (1) f is continuous for x ≥ 0 
(non-strict inequality) and $x^s$ approaches 0 as x tends to 0, (2) f is a 
rapidly decreasing function at infinity. The previous relation, generalizing the 
formula sΓ(s) = Γ(s + 1), can be iterated since $f \in S_+$ implies $f' \in S_+$, 
and leads to 

(13.14) $s(s+1)\cdots(s+n-1)\,\Gamma_f(s) = (-1)^n\,\Gamma_{f^{(n)}}(s+n)$. 


By analytic extension, this result holds for all s. If s remains in a vertical 
strip of finite width and if n is chosen to be sufficiently large, then s + n 
remains in a strip 0 < a ≤ Re(s) ≤ b < +∞, in which the right hand side is 
bounded. As 

$s(s+1)\cdots(s+n-1) \sim s^n$ for large |s| 

and given n, assertion (ii) follows. 


(b) Inversion formula. To prove this, it suffices to check conditions (a), (b) 
and (c) of section (i). The continuity of f and the convergence of $\int f(x)\,x^s\,d^*x$ 
for Re(s) > 0 are obvious since $f \in S(\mathbb{R}_+)$; integral (4) involved in the 
inversion formula converges for every σ that is not an integer ≤ 0 because of 
assertion (ii) of the theorem. 


(c) The function f associated to a function φ satisfying (i) and (ii). By 
(ii), $t \mapsto \varphi(\sigma + it)$ is a rapidly decreasing function at infinity for all σ ∈ R; 
it can, therefore, be integrated over any vertical Re(s) = σ ≠ 0, −1, ... and 
integral (4) can be computed. We show that it is independent of σ on any 
interval [a, b] which does not contain an integer ≤ 0. 

Let us integrate $\varphi(s)x^{-s}$ over a rectangle bounded by the verticals Re(s) = a 
and Re(s) = b > a and by the horizontals Im(s) = ±T. We have 

$|x^{-s}| \le x^{-a} + x^{-b}$ 

on the horizontal sides. This constant is independent of T. As for the factor 
φ(s), it is $O(T^{-n})$ for all n since the function $s^n\varphi(s)$ is bounded at infinity 
on the vertical strip a ≤ Re(s) ≤ b. The contributions from these sides, 
therefore, clearly tend to 0 as T → +∞. Setting $\psi(s) = \varphi(s)x^{-s}$, at the 
limit, 




(13.15) $\int_{\mathrm{Re}(s)=b} \psi(s)\,ds - \int_{\mathrm{Re}(s)=a} \psi(s)\,ds = 2\pi i \sum_{a<-k<b} \mathrm{Res}(\psi, -k)$. 


So the integrals over the verticals a and b are indeed equal if ψ, i.e. φ, has 
no poles between a and b. 
Let us calculate the residues of ψ. By assumption, 

$\varphi(s) = a_k/(s+k) + \ldots$ 

in the neighbourhood of the simple pole s = −k, where the unwritten terms 
represent a power series in s + k. Besides, 

$x^{-s} = x^k x^{-(s+k)} = x^k \exp[-(s+k)\log x] = x^k\,[1 - (s+k)\log x + \ldots]$ 

is a power series in s + k whose first term is $x^k$. Hence 

(13.16) $\mathrm{Res}(\psi, -k) = a_k x^k$. 


Next, (15) applied to 0 < a < b shows that setting 

(13.17) $2\pi i\,f(x) = \int_{\mathrm{Re}(s)=\sigma>0} \varphi(s)\,x^{-s}\,ds = i\,x^{-\sigma} \int \varphi(\sigma+it)\,x^{-it}\,dt$ 

for x > 0 (strict inequality) defines a function over $\mathbb{R}_+^*$ without 
any ambiguity. As $|x^{-it}| = 1$, 

$2\pi\,x^{\sigma}\,|f(x)| \le \int |\varphi(\sigma+it)|\,dt$ 

for any σ > 0; the second expression being independent of x, f is a rapidly 
decreasing function at infinity. 


(d) Differentiability of f for x > 0. To show that f is $C^\infty$ for strictly 
positive x, it must first be shown that (17) can be differentiated with respect 
to x. Thanks to Theorem 24 of Chapter V, § 7, n° 25, this amounts to checking 
that: 

(a) with respect to x, the function under the ∫ sign has as derivative a 
continuous function of the couple (x, t), which is obvious, 

(b) for any compact subset $H \subset \mathbb{R}_+^*$, this derivative, namely $-s\varphi(s)x^{-s-1}$, 
is dominated by a function $p_H(t)$ integrable over R and not depending 
on the parameter x ∈ H (normal convergence). 

But if x remains in an interval H = [a, b] with 0 < a < b < +∞, then 

$|s\varphi(s)x^{-s-1}| \le |s\varphi(s)|\,(a^{-\sigma-1} + b^{-\sigma-1}) = p_H(t)$; 

sφ(s) being rapidly decreasing as a function of t, the function $p_H$ is 
integrable, giving (b). 
This allows us to write that 

(13.18) $2\pi i\,f'(x) = -\int_{\mathrm{Re}(s)=\sigma>0} s\varphi(s)\,x^{-s-1}\,ds$. 


However, the function sφ(s), or more generally the product of φ(s) with a 
polynomial in s, visibly satisfies conditions (i) and (ii). Therefore, the 
arguments used for f show that f′ is a rapidly decreasing function at infinity with 
derivative given by 

$2\pi i\,f''(x) = +\int s(s+1)\,\varphi(s)\,x^{-s-2}\,ds$, 

where integration is over a vertical Re(s) = σ > 0. Iterating the process, f is 
seen to be infinitely differentiable for x > 0 (strict inequality), 
and all its derivatives 

(13.19) $2\pi i\,f^{(n)}(x) = (-1)^n \int s(s+1)\cdots(s+n-1)\,\varphi(s)\,x^{-s-n}\,ds$ 

are seen to be rapidly decreasing functions at infinity like f itself and for the 
same reason. Integration is obviously over a vertical Re(s) = σ > 0. 


(e) Behaviour of f in the neighbourhood of 0. The aim is to show that 
f, for the moment defined for x > 0, can be extended to a $C^\infty$ function for 
x ≥ 0. 

Integral (17) is taken over a vertical Re(s) = σ > 0, but it can be 
moved to the left, provided the poles of φ are taken into account: indeed, the 
argument that has led to (15) relies only on φ decreasing at infinity. Hence, 
in view of calculation (16) of these residues, 

(13.20) $2\pi i\,f(x) = 2\pi i\,(a_0 + a_1 x + \ldots + a_n x^n) + \int_{\mathrm{Re}(s)=-n-1/2} \varphi(s)\,x^{-s}\,ds$ 


for all n ∈ N; the point −n − 1/2 is only noteworthy for being located between 
−n − 1 and −n. For Re(s) = −n − 1/2, 

$\int |\varphi(s)\,x^{-s}|\,dt = x^{n+1/2} \int |\varphi(\sigma+it)|\,dt$. 

The additional integral in (20) is, therefore, $O(x^{n+1/2})$, so that (20) is a 
bounded expansion of f in the neighbourhood of 0. Since n ∈ N is arbitrary, 
this gives an asymptotic expansion 

(13.20’) $f(x) \approx \sum a_k x^k, \quad x \to 0$. 

Let us show that it can be differentiated term by term, i.e. that 

(13.20”) $f'(x) \approx \sum k\,a_k x^{k-1}, \quad x \to 0$. 
Indeed, formula (18) can also be written as 

$2\pi i\,x f'(x) = -\int s\varphi(s)\,x^{-s}\,ds$. 

Passing from f(x) to xf′(x) is done by replacing φ(s) by −sφ(s), a function 
still satisfying conditions (i) and (ii) in the statement. Hence (20’) can again 
be applied in this case provided that the residue $a_k$ of φ(s) at s = −k is 
replaced by that of −sφ(s), namely $k a_k$ since the point s = −k is a simple 
pole of φ; so 

$x f'(x) \approx \sum k\,a_k x^k$, 

which gives (20”). 

Iterating the argument, for all n, the derivative $f^{(n)}(x)$ is, therefore, seen 
to have an asymptotic expansion in the neighbourhood of 0, obtained by 
differentiating term by term that of f n times. To deduce that f can be 
extended to a $C^\infty$ function on x ≥ 0 (non-strict inequality), it 
remains to prove a rather easy general result: 


Lemma 1. Let f be a function defined and infinitely differentiable on an 
open interval 0 < x < b. Then f can be extended to an infinitely differentiable 
function on the interval 0 ≤ x < b if and only if the following conditions 
hold: 

(a) f has an asymptotic expansion 

$f(x) \approx \sum_{k \in \mathbb{N}} a_k x^k, \quad x \to 0$; 

(b) for all n ∈ N, the derivative $f^{(n)}(x)$ has an asymptotic expansion 
obtained by differentiating term by term that of f n times. 


First of all, the relation $f(x) = a_0 + a_1 x + o(x)$ shows both that f(x) 
tends to $a_0$ as x approaches 0 and that if we define $f(0) = a_0$, then the 
function f thus extended has a derivative equal to $a_1$ at the origin. Since, 
by (b), $f'(x) \approx a_1 + 2a_2 x + \ldots$ and so $\lim f'(x) = a_1$, the extension of f to 
0 ≤ x < b is $C^1$. Applying these arguments to f′ instead of f, f′ is seen to 
have as extension a $C^1$ function, so that f is $C^2$, etc. 

A variation of Lemma 1: it suffices to suppose that 

$\lim_{x \to 0} f^{(n)}(x) = f^{(n)}(0+)$ 

exists for all n. Indeed, for 0 < x < x + h, 

$|f(x+h) - f(x) - f'(x)h| \le h \cdot \sup |f'(x+k) - f'(x)|$, 

where the sup is extended to k ∈ [0, h]; since f′(0+) exists, this sup is ≤ r 
for sufficiently small x and h, and so as x tends to 0, 

$|f(h) - f(0+) - f'(0+)h| \le rh$; 

the function f, extended at x = 0, therefore, has a derivative f′(0) = f′(0+). 
An induction argument generalizes the result to successive derivatives. 

As shown above, f and its successive derivatives are rapidly decreasing 
functions at infinity. So f € S(R+). 


(f) The Mellin inversion formula for φ. We now show that φ is indeed 
the Mellin transform of f, i.e. that formula (17) defining f can be inverted. 
Since it is merely a Fourier transform in disguise, it amounts to checking the 
conditions used in section (i) for showing that the inversion formula applies 
in both directions: 

(a) the function φ is continuous on the vertical Re(s) = σ > 0, 
(b) integral (17) is (absolutely) convergent, 
(c) $\int |f(x)\,x^s|\,d^*x < +\infty$ for Re(s) > 0; 

the order of the conditions to be checked is changed as here we need to 
calculate φ in terms of f and not f in terms of φ. 

Checking (a) is trivial (φ is holomorphic); we would never have thought 
of writing (17) if condition (b) was not satisfied; finally (c) is obvious since, 
as seen above, f is continuous at x = 0 and, as noticed at the end of part (c) 
of the proof, is a rapidly decreasing function at infinity. 

The last assertion of the statement is totally unrelated to the Mellin 
transform; apply the following result: 


Lemma 2. Let φ be a function defined and holomorphic on an open set 

U : a < Re(s) < b, Im(s) > c 

and m be a real number. If the function $s^m\varphi(s)$ is bounded at infinity on every 
closed vertical strip of finite width contained in U, then the same holds for 
all of φ’s derivatives. 


To see this, argue as in n° 4, (iv). Take a closed strip 

B : Im(s) ≥ c′ > c, a′ ≤ Re(s) ≤ b′ with a < a′ < b′ < b 

contained in U and choose some r > 0 such that 

c < c′ − r, a < a′ − r, b′ + r < b. 

The closed strip 

B′ : Im(s) ≥ c′ − r, a′ − r ≤ Re(s) ≤ b′ + r 

is contained in U and, for all s ∈ B, the disc centered at s and of radius r 
is contained in B′. For s ∈ B, Cauchy’s formula (not quite correct, but the 
forgotten numerical factor has no influence on the orders of magnitude) 

$\varphi^{(n)}(s) = r^{-n} \oint \varphi[s + r\,\mathbf{e}(t)]\,\mathbf{e}(-nt)\,dt$ 

shows that (same remark) 

$|\varphi^{(n)}(s)| \le r^{-n} \sup |\varphi(w)|$ 

where the sup is extended to the points w of the circle |w − s| = r. By 
assumption, there is a constant $c_m$ such that $|w^m\varphi(w)| \le c_m$ at infinity in 
B′. Now |s| − r ≤ |w| ≤ |s| + r on the circle. Hence |w| ~ |s| for large |s|, and 

$|\varphi(w)| = O(|w|^{-m}) = O(|s|^{-m})$. 

So 

$|s^m\varphi^{(n)}(s)| \le |s^m|\,r^{-n}\,O(|s|^{-m}) = O(1)$ in B, 

and the lemma follows. It comes under the same philosophy as that of 
Weierstrass’ convergence theorem (Chap. VII, n° 19). 


Theorem 14 characterizes Mellin transforms of functions belonging to 
$S(\mathbb{R}_+)$, but the method can obviously be applied to other cases. For 
example, let us try to characterize the Mellin transforms of functions f that 
have the following properties on $\mathbb{R}_+^*$: 

(a) f and its successive derivatives are $C^\infty$ and rapidly decreasing functions 
at infinity; 
(b) f has an unbounded asymptotic expansion 

$f(x) \approx \sum_{n \in \mathbb{N}} a_n x^{u_n}$ 

in the neighbourhood of 0, with real exponents $u_0 < u_1 < \ldots$ such 
that $\lim u_n = +\infty$; 

(c) for all k ∈ N, in the neighbourhood of 0, the derivative $f^{(k)}(x)$ has an 
unbounded asymptotic expansion obtained by formally differentiating 
that of f. 


As seen at the start of this n°, the Mellin transform 

$\varphi(s) = \int f(x)\,x^s\,d^*x = \Gamma_f(s)$, 

a priori defined for Re(s) > $-u_0$, can be extended to a meromorphic function 
on all of C whose poles, all simple, are the points $-u_n$. The formula 
$s\,\Gamma_f(s) = -\Gamma_{f'}(s+1)$ obviously continues to hold for Re(s) > $-u_0$. The proof is the 
same as before. Since, thanks to (c), the successive derivatives of f also clearly 
satisfy conditions (a) and (b) above, it can be iterated and, as in the proof of 
Theorem 14, it follows that the function $s^n\varphi(s)$ is bounded at infinity (hence 
so are its successive derivatives) on any vertical strip of finite width. 

We leave it to the reader to show that conversely any meromorphic 
function φ(s) on C with these two properties (its only singularities are simple 
poles at points $-u_0 > -u_1 > \ldots$ on the real axis and it is rapidly decreasing 
at infinity on any vertical strip of finite width) is the Mellin transform of a 
function of the previous type. 


14 — Stirling’s Formula for the Gamma Function 


The simplest example of an application of the inversion formula can be 
obtained by choosing $f(x) = e^{-x}$, obviously in $S(\mathbb{R}_+)$. Its Mellin transform is, 
by definition, Γ(s). Hence 

(14.1) $e^{-x} = \frac{1}{2\pi i} \int_{\mathrm{Re}(s)=\sigma} \Gamma(s)\,x^{-s}\,ds = \frac{1}{2\pi} \int \Gamma(\sigma+it)\,x^{-\sigma-it}\,dt, \quad \sigma > 0$, 

and a slightly less simple formula for −p − 1 < σ < −p. This result is due 
to Mellin himself (1910), but it can be easily obtained without invoking the 
general theorem: it suffices to reconstruct the proof in this particular case... 

Theorem 14 also shows that, on any vertical not passing through a pole, 
the function $t \mapsto \Gamma(\sigma + it)$ is in the Schwartz space. This is a weak result: 
indeed, formula (27) that will be proved at the end of this section shows that 
the Γ function decreases exponentially on every vertical. 

This result is based on an evaluation (Stieltjes) which for s ∈ N reduces 
to Stirling’s formula 

(14.2) $n! \sim (2\pi)^{1/2}\,n^{n+1/2}\,e^{-n}$ as n → +∞ 

of Chapter VI, n° 18, namely 

(14.3) $\Gamma(s) \sim (2\pi)^{1/2}\,s^{s-1/2}\,e^{-s}$ 

as s tends to infinity in an angle |Arg(s)| ≤ π − δ with δ > 0; for integer s, 
the equivalence with (2) follows from the relation 

$(n-1)! = n!/n \sim (2\pi)^{1/2}\,n^{n-1/2}\,e^{-n}$. 
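
A quick numerical illustration of (14.2), offered only as a sketch: the helper name `stirling_log_ratio` is an assumption made here, and the logarithm of n! is taken from the standard library function `math.lgamma` to avoid overflow.

```python
import math

def stirling_log_ratio(n):
    """log of n! / ((2 pi)^(1/2) n^(n+1/2) e^(-n)), computed with logarithms."""
    log_factorial = math.lgamma(n + 1)  # log(n!)
    log_stirling = 0.5 * math.log(2 * math.pi) + (n + 0.5) * math.log(n) - n
    return log_factorial - log_stirling

for n in (10, 100, 1000, 10000):
    # The ratio tends to 1, which is the meaning of the sign ~ in (14.2)
    print(n, math.exp(stirling_log_ratio(n)))
```

The printed ratios decrease towards 1, and the logarithm of the ratio is the o(1) term of formula (14.7).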


We first prove⁵⁴ relation (3), then we show how to deduce the behaviour of 
the Γ function on the verticals. These results arise in fields such as analytic 
number theory and in the study of the asymptotic behaviour of important 
special functions. Proving them is a complex logarithm calculation exercise 
involving all the pitfalls of the topic.⁵⁵ In what follows, set 

$\mathbb{C} - \mathbb{R}_- = \mathbb{C}_+$. 

⁵⁴ The rest of this n° is essentially a fairly concise, detailed presentation of 
Remmert 2, Chap. 2, §4. Dieudonné, Calcul infinitésimal, IX.7.6, gives a genuine 
asymptotic expansion by using in full the Euler-MacLaurin formula. N. Bourbaki, 
Fonctions d’une variable réelle is another reference. 

The idea behind the proof is simple. A naive calculation shows (3) to be 
seemingly equivalent to 

(14.4) $\log\Gamma(s) = \frac12\log 2\pi + (s - 1/2)\log s - s + o(1)$. 

However, the formula 

(14.5) $\Gamma(s) = \lim\, n!\,n^s / s(s+1)\cdots(s+n)$, 

which holds for all s other than the integers ≤ 0, seems to show that 

(14.6) $\log\Gamma(s) = \lim \left\{ \log n! + s\log n - \sum_{p=0}^{n} \log(s+p) \right\}$ 

and (2) shows that 

(14.7) $\log(n!) = \frac12\log(2\pi) + (n + 1/2)\log n - n + o(1)$. 


The problem, therefore, appears to lie in the evaluation of the sum of the 
log(s + p), which, as we shall see, is made possible by the Euler-MacLaurin 
summation formula (Chapter VI, §2, n° 16). Combining these results, (4) 
and hence (3) can be expected to be justified. But complex logarithms, not 
to speak of their limits, cannot be used like those of Napier; hence their 
meaning will first need to be specified. 

Let us start by specifying the meaning of the expression $s^{s-1/2}$ occurring 
in (3), namely 

(14.8) $s^{s-1/2} = \exp[(s - 1/2)\,\mathrm{Log}\,s]$ for $s \in \mathbb{C}_+$, 


where, on $\mathbb{C}_+$, the Log function is the uniform branch which reduces to the 
Napier function on the positive real axis: 

(14.9) $\mathrm{Log}\,z = \log|z| + i\,\mathrm{Arg}(z)$ with |Arg(z)| < π. 

This function allows the following more general definition 

$z^w = \exp(w\,\mathrm{Log}\,z)$ for $z \in \mathbb{C}_+$, $w \in \mathbb{C}$. 


⁵⁵ See Serge Lang, Complex Analysis (Springer-New York, 4th. ed., 1999), pp. 422–428 
for an example of a proof where complex logs are used without precaution. 

For technical reasons, the Log function needs to be extended to all of C* 
by setting 

(14.9’) $L(z) = \log|z| + i\,\mathrm{Arg}(z)$ with $-\pi < \mathrm{Arg}(z) \le +\pi$ 


on C*, and so z = exp[L(z)] for all z ∈ C*. The L function is not holomorphic 
on all of C*; it is discontinuous at every point of $\mathbb{R}_-^*$. To be precise, let us 
consider a sequence of points $z_n \in \mathbb{C}^*$ converging to some z ∈ C*. If $z \in \mathbb{C}_+$, 
clearly $\mathrm{Arg}(z) = \lim \mathrm{Arg}(z_n)$, and as a result $L(z) = \lim L(z_n)$. But if $z \in \mathbb{R}_-^*$, 
then Arg(z) = +π by convention whereas the arguments of the $z_n$ are, for 
large n, near +π or −π according to the values of n; hence there are 
integers $k_n \in \{0, 1\}$ such that 

$\mathrm{Arg}(z) = \lim\,[\mathrm{Arg}(z_n) + 2k_n\pi]$. 

In conclusion, in all cases, there exist $k_n$ such that 

(14.10) $L(\lim z_n) = \lim\,[L(z_n) + 2k_n\pi i]$. 


In such a case, $L(z_n)$ may approach a limit. The same then holds for $2k_n\pi i$, 
which is, therefore, constant for large n; as a result, 

(14.10’) $L(\lim z_n) = \lim L(z_n) \bmod 2\pi i$ if $\lim L(z_n)$ exists. 


The functional equation of the logarithm can be generalized with some 
precaution to the function L(s). With definition (9’) for the argument, 
clearly⁵⁶ 

$z = z_1 \cdots z_n \Longrightarrow \mathrm{Arg}(z) = \sum \mathrm{Arg}(z_p) \bmod 2\pi$, 

so that 

(14.11) $L(z_1 \cdots z_n) = \sum L(z_p) \bmod 2\pi i$, 

and 

(14.12) $L(z_1 \cdots z_n) = \sum L(z_p) \iff -\pi < \sum \mathrm{Arg}(z_p) \le +\pi$. 


For real x > 0 and s = σ + it ∈ C, $x^s = x^\sigma \exp(it\log x)$. Since the 
argument of $x^\sigma \in \mathbb{R}_+^*$ is zero, (12) shows that 

$L(x^s) = L(x^\sigma) + L[\exp(it\log x)]$ 

and as the argument of exp(it log x) is equal to t log x mod 2π, 

(14.13) $L(x^s) = sL(x) \bmod 2\pi i$ for $x \in \mathbb{R}_+^*$, $s \in \mathbb{C}$; 

the formula $L(x^s) = sL(x)$ holds if s ∈ R since calculations are then in $\mathbb{R}_+^*$. 

⁵⁶ In what follows, write a = b mod 2π or mod 2πi to mean that a − b is a multiple 
of 2π or of 2πi as the case may be. The traditional notation is the sign ≡, which 
is unnecessary if it is followed by an indication such as “mod 2π”. 
We will also need the relation 

(14.14) $L(1+z) = z - z^2/2 + z^3/3 - \ldots$ for |z| < 1; 

it is true for real z since L(1 + z) = log(1 + z) then, and so by analytic 
extension also on the unit disc; besides, in effect (14) only deals with the Log 
function. 

Let us now specify what should be understood by Log Γ(s). The Γ function 
is holomorphic and never zero (infinite product expansion) on the simply 
connected domain $\mathbb{C}_+$. Hence the equation exp[f(s)] = Γ(s) has holomorphic 
solutions in $\mathbb{C}_+$, namely (§1, n° 3, Corollary 2 of Theorem 3) some primitives 
of Γ′(s)/Γ(s). Choose the function 

(14.15) $\mathrm{Log}\,\Gamma(s) = \int_1^s \frac{\Gamma'(z)}{\Gamma(z)}\,dz$, 

where integration is along a path connecting 1 to s in $\mathbb{C}_+$, the simplest being 
the line segment. With this definition, 

(14.16) $\mathrm{Log}\,\Gamma(s) = \log\Gamma(s)$ for $s \in \mathbb{R}_+^*$ 

since then the real function Γ′(x)/Γ(x) is integrated over the interval (1, s). 

It may be thought that Log Γ(s) = L[Γ(s)]. This would be the case if 
$s \in \mathbb{C}_+ \Longrightarrow \Gamma(s) \in \mathbb{C}_+$ was known to hold, for then the right hand side, 
composed of two holomorphic functions on $\mathbb{C}_+$, would, like the left hand one, 
also be holomorphic. The formula, obviously exact on $\mathbb{R}_+^*$, would then hold in 
all of $\mathbb{C}_+$. But the assumption on which this argument is based does not seem to be 
exact.⁵⁷ Since, by definition, 

$\exp[\mathrm{Log}\,\Gamma(s)] = \Gamma(s) = \exp\{L[\Gamma(s)]\}$ 

the relation 

(14.17) $\mathrm{Log}\,\Gamma(s) = L[\Gamma(s)] \bmod 2\pi i$ 

is exact and for us, this suffices. 
Exercise. Using the infinite product, show that 

(14.18) $-\Gamma'(s)/\Gamma(s) = C + 1/s + \sum_{n \ge 1} \left( \frac{1}{s+n} - \frac{1}{n} \right)$ 

if s is not an integer ≤ 0, then that 

(14.19) $\mathrm{Log}\,\Gamma(s) = -Cs - \mathrm{Log}\,s + \sum_{n \ge 1} [s/n - \mathrm{Log}(1 + s/n)]$ for $s \in \mathbb{C}_+$. 

⁵⁷ Remmert 2 alludes discreetly to it on p. 42 but, unfortunately, does not prove it, 
and I do not see how to do so. 
These preliminary explanations allow us to return to 

$\Gamma(s) = \lim\,[n!\,n^s / s(s+1)\cdots(s+n)]$. 

First of all, by (10), (11) and (17), 

(14.20) $\mathrm{Log}\,\Gamma(s) = \lim \left\{ \log(n!) + s\log n - \sum_{p=0}^{n} L(s+p) + 2k_n\pi i \right\}$ 

for properly chosen $k_n \in \mathbb{Z}$, and by Stirling, 

(14.21) $\log(n!) = \frac12\log(2\pi) + (n + 1/2)\log n - n + o(1)$. 


To evaluate the sum of the L(s + p) for given $s \in \mathbb{C}_+$, set 
f(x) = L(s + x) = Log(s + x) for x ≥ 0; we get a $C^\infty$ function such that 
$f'(x) = (s+x)^{-1}$, $f''(x) = -(s+x)^{-2}$ since Log z is holomorphic on $\mathbb{C}_+$ and has 1/z as 
derivative. Instead of referring the reader to Chapter VI, §2, n° 16 for the 
general Euler-MacLaurin formula, let us introduce the functions 


(14.22) $P_1(x) = x - 1/2, \quad P_2(x) = \frac12(x - x^2)$. 

So $P_1' = 1$ and $P_2' = -P_1$. As $P_2(0) = P_2(1) = 0$, integrating twice by parts 
immediately shows that 

$\int_0^1 f(x+p)\,dx = \frac12[f(p) + f(p+1)] - \int_0^1 f''(x+p)\,P_2(x)\,dx$. 

Setting 

(14.23) $P_2^*(x) = P_2(x - [x])$ 

to be a function with period 1 equal to $P_2$ on (0,1), transform the last integral 
into that of the function $f''(x)P_2^*(x)$ on (p, p+1). Summing from p = 0 to 
p = n − 1, 

$\int_0^n f(x)\,dx = -\frac12[f(0) + f(n)] + \sum_{p=0}^{n} f(p) - \int_0^n f''(x)\,P_2^*(x)\,dx$ 


follows. Integrating f(x) = Log(s + x) by parts, we get 

$\int_0^n f(x)\,dx = (s+n)\,\mathrm{Log}(s+n) - s\,\mathrm{Log}\,s - n$, 

and so 

$\sum_{p=0}^{n} \mathrm{Log}(s+p) = (s+n)\,\mathrm{Log}(s+n) - s\,\mathrm{Log}\,s - n + \frac12[\mathrm{Log}\,s + \mathrm{Log}(s+n)] - \int_0^n (s+x)^{-2}P_2^*(x)\,dx$. 


Formula (20) then shows that, modulo some small calculations, Log Γ(s) is 
the limit of a sequence whose general term is equal to 

$z_n = \frac12\log(2\pi) - (n + s + 1/2)\,[\mathrm{Log}(s+n) - \log n] + (s - 1/2)\,\mathrm{Log}\,s + \int_0^n (s+x)^{-2}P_2^*(x)\,dx$ 

up to a multiple of 2πi. If $\lim z_n = z$ is shown to exist, then relation (10’) 
will show that Log Γ(s) = 2kπi + z, and hence that $\Gamma(s) = e^z$. 

First, the function $P_2^*(x)$ being bounded, the integral over (0, n) converges 
to what Remmert denoted by 

(14.24) $\mu(s) = \int_0^{+\infty} (s+x)^{-2}P_2^*(x)\,dx = -\int_0^{+\infty} (s+x)^{-1}P_1^*(x)\,dx$, 

where $P_1^*(x) = P_1(x - [x])$. On the other hand, s + n = (1 + s/n)n, and as 
Arg(n) = 0, (12) applies and gives Log(s + n) = Log(1 + s/n) + log n. So, by (14), 

$\mathrm{Log}(s+n) - \log n = \mathrm{Log}(1 + s/n) = s/n + O(1/n^2)$. 

As a result, 

$\lim\,(n + s + 1/2)\,[\mathrm{Log}(s+n) - \log n] = s$. 

Thus 

$\lim z_n = \frac12\log(2\pi) - s + (s - 1/2)\,\mathrm{Log}\,s + \mu(s)$ 

exists and the following formula holds: 

(14.25) $\Gamma(s) = (2\pi)^{1/2}\,s^{s-1/2}\,e^{-s}\,e^{\mu(s)}$ for all $s \in \mathbb{C}_+$. 


It remains to show that $e^{\mu(s)}$ approaches 1 as s tends to infinity 
in $\mathbb{C}_+$ in a not too arbitrary manner, and for this, that μ(s) tends to 0. Set 
$s = r\exp(i\varphi)$. For x > 0, 


$|s+x|^2 = (x + r\cos\varphi)^2 + r^2\sin^2\varphi = r^2 + 2xr\cos\varphi + x^2 = (r+x)^2 - 4xr\sin^2(\varphi/2)$; 

as $4xr \le (r+x)^2$ (calculate the difference), 

$|s+x|^2 \ge (r+x)^2\cos^2(\varphi/2)$. 


Since the periodic function $P_2^*(x)$ is contained between 0 and 1/8 everywhere, 

$8\,|\mu(s)|\cos^2(\varphi/2) \le \int_0^{+\infty} (r+x)^{-2}\,dx = 1/r$ 

or (Remmert, p. 52) 

(14.26) $|\mu(s)| \le \frac{1}{8\,|s|\cos^2(\varphi/2)}$ in $\mathbb{C}_+$. 
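
For real s > 0 the estimate (14.26) is easy to test numerically, since Arg(s) = 0 and the bound reduces to 1/(8s). The sketch below is an illustration only: the helper name `mu` is an assumption, and μ(s) is read off from (14.25) using the standard library function `math.lgamma`, not computed from the integral (14.24).

```python
import math

def mu(s):
    """mu(s) for real s > 0, from (14.25): Gamma(s) = (2 pi)^(1/2) s^(s-1/2) e^(-s) e^(mu(s))."""
    return math.lgamma(s) - (0.5 * math.log(2 * math.pi) + (s - 0.5) * math.log(s) - s)

for s in (0.5, 1.0, 5.0, 50.0):
    bound = 1.0 / (8.0 * s)  # (14.26) with Arg(s) = 0, so cos^2(phi/2) = 1
    print(s, mu(s), bound)
```

In every case the printed value of μ(s) is positive, below the bound, and tends to 0 as s grows, which is the content of the Stieltjes formula on the positive real axis.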


Hence lim μ(s) = 0 if s tends to infinity in such a way that the product 
$|s|\cos^2(\varphi/2)$ does so as well, for example if we consider a subset of C defined 
by 

|Arg(s)| ≤ π − δ with 0 < δ < π. 

The Stieltjes formula holds under this condition, for example if s remains in 
the half-plane Re(s) > c and hence in a vertical strip of finite width. 


As for Remmert, he avoids all the limit calculations that have been 
detailed here. He a priori introduces the function μ(s) and, using the second 
integral, he notices through an elementary calculation that 

$\mu(s) - \mu(s+1) = (s + 1/2)\,\mathrm{Log}(1 + 1/s) - 1$, 

then that the function $f(s) = s^{s-1/2}e^{-s}e^{\mu(s)}$, holomorphic on $\mathbb{C}_+$, satisfies 
Wielandt’s assumptions [n° 10, (i)]. Hence f(s) = f(1)Γ(s), and in particular 
f(n) = f(1)(n − 1)!; as μ(n) obviously tends to 0, comparison with Stirling’s 
formula shows that $f(1) = (2\pi)^{-1/2}$. An excellent example of Blitzbeweis! 

To show that the gamma function decreases exponentially on the verticals, 

$|s^{s-1/2}| = \exp \mathrm{Re}[(s - 1/2)\,\mathrm{Log}\,s]$ 

still needs to be evaluated. For s = σ + it, 

$\mathrm{Re}[(s - 1/2)\,\mathrm{Log}\,s] = (\sigma - 1/2)\log|s| - t\,\mathrm{Arg}(s)$. 


As s tends to infinity in the vertical strip B of finite width, the argument of 
s tends to π/2 if t tends to +∞, and to −π/2 if t tends to −∞; in the first 
case, using the power series for Arctg, 

$\pi/2 - \mathrm{Arg}(s) = \mathrm{Arctg}(\sigma/t) = \sigma/t + O(t^{-3})$ 

since σ remains in a compact set, and so 

$-t\,\mathrm{Arg}(s) = -\pi|t|/2 + \sigma + O(t^{-2})$, 


a result which also holds in the second case. For the same reason, 

$|s| = (\sigma^2 + t^2)^{1/2} = |t|\,(1 + O(t^{-2}))$, 

$\log|s| = \log|t| + \log(1 + O(t^{-2})) = \log|t| + O(t^{-2})$. 

So finally, 

$\mathrm{Re}[(s - 1/2)\,\mathrm{Log}\,s] = -\pi|t|/2 + (\sigma - 1/2)\log|t| + \sigma + O(t^{-2})$, 

and so 

$|s^{s-1/2}| \sim |t|^{\sigma-1/2}\,e^{\sigma}\,e^{-\pi|t|/2}$ at infinity in B 

since the factor $\exp[O(|t|^{-2})]$ tends to exp(0) = 1. Returning to (25), we 
finally get the evaluation sought (Remmert tells us it is due to the Italian 
Salvatore Pincherle, 1889), namely 

(14.27) $|\Gamma(\sigma + it)| \sim (2\pi)^{1/2}\,|t|^{\sigma-1/2}\,e^{-\pi|t|/2}$. 

It is reassuring to see that this result is compatible with formula (10.5.8) 

$|\Gamma(1/2 + it)|^2 = \pi/\cosh \pi t$. 
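
The compatibility just mentioned can be made explicit with a few lines of Python, given here as a sketch only (the function names are illustrative assumptions). For σ = 1/2 the square of the right hand side of (14.27) is 2π exp(−π|t|), and its ratio to π/cosh πt is exactly 1 + exp(−2π|t|), which tends to 1.

```python
import math

def pincherle_squared(t):
    """Square of the right-hand side of (14.27) for sigma = 1/2: 2 pi exp(-pi |t|)."""
    return 2.0 * math.pi * math.exp(-math.pi * abs(t))

def exact_squared(t):
    """|Gamma(1/2 + it)|^2 = pi / cosh(pi t)."""
    return math.pi / math.cosh(math.pi * t)

for t in (0.5, 1.0, 2.0, 4.0):
    # The ratio equals 1 + exp(-2 pi |t|) and approaches 1 very quickly
    print(t, pincherle_squared(t) / exact_squared(t))
```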


15 — The Fourier Transform of 1/cosh πx 


Like Γ(1/2 + it), the function 1/cosh πx is in the space S(R). This can be 
directly verified since its n-th derivative is obtained by dividing by $\cosh^{2^n}\pi x$ 
a polynomial in sinh πx and cosh πx all of whose monomials are of total 
degree < $2^n$; now, $\cosh \pi x \sim \frac12\exp(\pi|x|)$ at infinity. 
We show that it is identical to its Fourier transform. This result will later 
give rise to the strange identity (17) that can be found at the end of this n°. 
The Fourier transform of 1/cosh πx is the integral 

(15.1) $\int_{-\infty}^{+\infty} \frac{e^{-2\pi i t y}}{\cosh \pi t}\,dt = \frac{2}{\pi} \int_0^{+\infty} \frac{x^{2iy}}{1+x^2}\,dx$ 

as shown by the change of variable $e^{-\pi t} = x$. More generally, the integral 

(15.2) $\varphi(s) = \int_0^{+\infty} \frac{x^s}{1+x^2}\,dx, \quad |\mathrm{Re}(s)| < 1$, 

therefore, remains to be computed. It is the Mellin transform of 

$f(x) = x/(1+x^2) = f(1/x)$. 

If 

$\varphi(s) = \frac{\pi}{2\cos(\pi s/2)}$ 

is shown to hold, the Fourier transform sought will then be equal to 

$2\varphi(2iy)/\pi = 1/\cosh \pi y$ 

as expected. We give three methods for this calculation. 
First proof. It is the shortest. Start with the formula 
I'(s)P(1-s) =7/sinas 


and, for 0 < Re(s) < 1, write 


$\Gamma(s)\Gamma(1-s) = \int e^{-x}x^s\,d^*x \int e^{-y}y^{-s}\,dy = \iint e^{-x-y}(x/y)^s\,d^*x\,dy =$
$= \int dy \int e^{-x-y}(x/y)^s\,d^*x = \int dy \int e^{-xy-y}x^s\,d^*x =$
$= \int x^s\,d^*x \int e^{-(1+x)y}\,dy = \int \frac{x^s\,d^*x}{1+x},$


where all integrals are over $]0,+\infty[$. These transformations are justified by
the Lebesgue-Fubini Theorem (theorem 25 of Chap. V, n° 26 would suffice)
since all functions considered are integrable over $(0,+\infty)$ for $0 < \mathrm{Re}(s) < 1$.
The change of variable $x \mapsto x^2$ in the last integral, which transforms $d^*x$ into
$2d^*x$, then shows that


$\Gamma(s)\Gamma(1-s) = 2\varphi(2s-1),$
and so, replacing $s$ by $(s+1)/2$,
$\varphi(s) = \frac{\pi}{2\cos(\pi s/2)}$
for $|\mathrm{Re}(s)| < 1$, qed.
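As a sanity check on the closed form, here is a minimal sketch in Python restricted to real $s$ with $|s| < 1$. The substitution $x = e^u$, which turns $\varphi(s)$ into the integral of $e^{su}/(2\cosh u)$ over $\mathbb{R}$, and the names `phi_numeric` and `phi_closed` are choices made for this illustration only.

```python
import math

def phi_numeric(s, half_width=400.0, step=0.02):
    """Trapezoidal estimate of the integral of x**s/(1+x**2) over (0, +inf).

    Valid for real s with |s| < 1. With x = exp(u), the integrand becomes
    exp(s*u)/(2*cosh(u)), which decays exponentially in both directions.
    """
    n = int(round(half_width / step))
    total = 0.0
    for k in range(-n, n + 1):
        u = k * step
        weight = 0.5 if abs(k) == n else 1.0
        total += weight * math.exp(s * u) / (2.0 * math.cosh(u))
    return total * step

def phi_closed(s):
    """Closed form pi / (2*cos(pi*s/2)) obtained in the text."""
    return math.pi / (2.0 * math.cos(math.pi * s / 2.0))

for s in (0.0, 0.5, -0.9):
    print(s, phi_numeric(s), phi_closed(s))

# The relation Gamma(s)*Gamma(1-s) = 2*phi(2s-1), tested at s = 3/4.
print(math.gamma(0.75) * math.gamma(0.25), 2.0 * phi_closed(0.5))
```

The closer $|s|$ is to 1, the slower the integrand decays, which is why the truncation width is taken generously.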
Second Proof. 


(15.3') $f(x) = x - x^3 + x^5 - \ldots$ for $|x| < 1$,


(15.3'') $f(x) = x^{-1} - x^{-3} + \ldots$ for $|x| > 1$.


A simple idea consists in multiplying series (3') and (3'') by $x^s$ and in
integrating them term by term over $(0,1)$ and $(1,+\infty)$ with respect to $d^*x$,
taking into account the fact that the integral $\int x^s\,d^*x$ extended to $(0,1)$ or
to $(1,+\infty)$ is equal to $1/s$ or $-1/s$ when it converges. A formal calculation
thus gives


$\varphi(s) = [1/(s+1) - 1/(s+3) + \ldots] - [1/(s-1) - 1/(s-3) + \ldots]$


for $|\mathrm{Re}(s)| < 1$ since all integrals in question are then convergent. But the two
series obtained, though semi-convergent (n° 9), are not absolutely convergent.
Thus a priori the permutation of the signs $\sum$ and $\int$ seems suspect. To justify
it, replace (3') by the identity


$f(x) = x - x^3 + \ldots + (-1)^n x^{2n+1} + (-1)^{n+1} x^{2n+2} f(x),$
which is, up to a factor $x$, just the relation
$1/(1-q) = 1 + q + \ldots + q^n + q^{n+1}/(1-q)$


for $q = -x^2$. Hence the contribution $\varphi^-(s)$ from the interval $(0,1)$ is equal
to


$\sum_{p=0}^{n} \frac{(-1)^p}{s+2p+1} + (-1)^{n+1}\varphi^-(s+2n+2)$


for all $n \ge 0$. This presupposes that $\mathrm{Re}(s) > -1$ in order for the integral at
$p = 0$ and hence for the following ones to be convergent, but in fact holds for
all $s \in \mathbb{C}$ by analytic extension since $\varphi^-(s)$ is obviously meromorphic on all
of the plane. However, for any $s \in \mathbb{C}$,


$\varphi^-(s+2n+2) = \int_0^1 \frac{x^{s+2n+3}}{1+x^2}\,d^*x$ for $\mathrm{Re}(s)+2n+3 > 0$,
hence for large $n$. As $n$ increases, the function $x^{s+2n+3}/(1+x^2)$ converges to 0
everywhere on $]0,1[$ while remaining, in modulus, bounded by the fixed function
$x^{\mathrm{Re}(s)+2n_0+3}$, integrable with respect to $d^*x$, as soon as $n \ge n_0$ with
$\mathrm{Re}(s)+2n_0+3 > 0$. Therefore, the integral tends to 0 (dominated convergence
with respect to the measure $d^*x$), and once again the series is convergent and
the relation


$\varphi^-(s) = \sum_{p=0}^{\infty} \frac{(-1)^p}{s+2p+1}$
holds for all $s$.
holds for all s. 


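To deal with the contribution $\varphi^+(s)$ from the interval $(1,+\infty)$ to the
calculation of $\varphi(s)$, note that $\varphi^+(s) = \varphi^-(-s)$, and so


$\varphi^+(s) = \sum_{p=0}^{\infty} \frac{(-1)^p}{2p+1-s}.$


For $|\mathrm{Re}(s)| < 1$,


$\varphi(s) = \int_0^{+\infty} \frac{x^s\,dx}{1+x^2} = \sum_{p \in \mathbb{Z}} \frac{(-1)^p}{s+2p+1} = \frac{\pi}{2\cos(\pi s/2)}$


again holds thanks to (9.7'').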
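The slow, merely semi-convergent behaviour of this series can be observed directly. The sketch below, in Python, sums the terms of $\varphi^-(s)$ and $\varphi^+(s)$ in pairs for real $s$ with $|s| < 1$; the name `phi_series` and the number of terms are illustrative assumptions. Since the paired terms alternate in sign and decrease in modulus, the error after $N$ terms is at most the next term, of order $1/N$.

```python
import math

def phi_series(s, terms=100000):
    """Partial sum over p = 0..terms-1 of (-1)**p * [1/(s+2p+1) + 1/(2p+1-s)]."""
    total = 0.0
    for p in range(terms):
        sign = -1.0 if p % 2 else 1.0
        total += sign * (1.0 / (s + 2 * p + 1) + 1.0 / (2 * p + 1 - s))
    return total

def phi_closed(s):
    """Closed form pi / (2*cos(pi*s/2))."""
    return math.pi / (2.0 * math.cos(math.pi * s / 2.0))

for s in (0.0, 0.5, -0.3):
    print(s, phi_series(s), phi_closed(s))
```

For $s = 0$ the series reduces to twice the Leibniz series for $\pi/4$.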


Third Proof. The reader may be happy with these proofs, but this § is
supposed to present applications of Cauchy's formula. So here is another way
of calculating $\varphi(s)$. It is far less simple and miraculous, but can be generalized
to all rational fractions.
It consists in integrating the function


$g(z) = z^s/(1+z^2)$


along the contour $\mu$ (see figure below) for $|\mathrm{Re}(s)| < 1$. In the above, $z^s =
\exp(s\,\mathrm{Log}\,z)$ in $\mathbb{C} - \mathbb{R}_+$, where


(15.4) $\mathrm{Log}\,z = \log|z| + i\,\mathrm{Arg}\,z, \qquad 0 < \mathrm{Arg}\,z < 2\pi.$


We get $2\pi i(\rho_i + \rho_{-i})$, which involves the residues of the function at the simple
poles $i$ and $-i$. Now, by (4),


$\rho_i = \lim (z-i)z^s/(1+z^2) = i^s/2i = e^{\pi i s/2}/2i,$
and $\rho_{-i} = -e^{3\pi i s/2}/2i$ follows similarly. As a result,
(15.5) $\int_\mu z^s\,dz/(1+z^2) = \pi e^{\pi i s/2}\left(1 - e^{\pi i s}\right).$
LL 


The integral $\varphi(s)$ remains to be deduced from all this.


Fig. 15.14. 
Let $r$ and $R$ be the radii of the two circles and $\pm\delta$ be the ordinates of the
two horizontal segments that $\mu$ consists of. The segment with ordinate
$+\delta$ contributes


(15.6) $\int_{r'}^{R'} (x+i\delta)^s\,dx/\left[1+(x+i\delta)^2\right]$
to the integral, where the limits $r' < r$ and $R' < R$ tend to $r$ and $R$ as $\delta$
tends to 0. Since the argument of $x+i\delta$ approaches 0 as $\delta$ tends to 0, for all
real $x > 0$, the function


$F(x,\delta) = (x+i\delta)^s/\left[1+(x+i\delta)^2\right]$


tends to $F(x,0+) = x^s(1+x^2)^{-1}$, where $x^s = e^{s\log x}$ takes its usual value
for real $x > 0$ (Chap. IV). The limit of integral (6) can therefore be expected
to be the integral of $F(x,0+)$ extended over the whole interval $[r,R]$, but this
requires justification, which will be provided by the theorem of dominated
convergence.

First, it is clear that $r' > r/2$ for $\delta$ sufficiently small. Integral (6) is, therefore,
the one over the fixed interval $(r/2, R)$ of the function equal to $F(x,\delta)$
between $r'$ and $R'$ and 0 elsewhere. This new function tends to $F(x,0+)$ in
$]r,R[$ and to 0 elsewhere in the interval $(r/2,R)$. On the other hand, the formula
defining $F(x,\delta)$ continues to be well-defined for $x > 0$ and $\delta = 0$, and
the result is obviously a continuous function on the product set $\mathbb{R}_+^* \times \mathbb{R}_+$. It
follows that $F$ is bounded on the compact set $\{x \in [r/2,R]\ \&\ 0 \le \delta \le 1\}$. As
$\delta$ tends to 0, the modified function $F(x,\delta)$ integrated over $[r/2,R]$ tends to
the function equal to $F(x,0+)$ on $]r,R[$ and zero elsewhere, while remaining
dominated by a fixed constant. As integration is over a compact set, passing
to the limit is justified and finally


(15.7) $\lim \int_{r'}^{R'} F(x,\delta)\,dx = \int_r^R F(x,0+)\,dx = \int_r^R x^s\,dx/(1+x^2)$


as expected. 

The integral along the segment with ordinate $-\delta$ can be dealt with in
a similar way; it is necessary to change the direction followed and to take
into account that the argument of $x - i\delta$ tends to $2\pi$, which introduces a
factor $e^{2\pi i s} = e(s)$ in the calculation. The limit value is, therefore, integral
(7) multiplied by $-e(s)$.

Hence, for given $r$ and $R$ and $\delta$ tending to 0, the total contribution from
the horizontal segments is equal to


(15.8) $(1 - e(s))\int_r^R x^s\,dx/(1+x^2).$


By (2), this expression tends to $(1 - e(s))\varphi(s)$ as $r$ and $R$ tend to 0 and $+\infty$.
To show that the contributions from the circular arcs tend to 0, use the
lemma from the introduction to this §: in $U = \mathbb{C} - \mathbb{R}_+$ and for all $s \in \mathbb{C}$,


(15.9) $|z^s| \asymp |z|^{\mathrm{Re}(s)}$ in $U$.


As $|1+z^2|^{-1} = O(R^{-2})$ over the large circle, the integral is $O(R^{\mathrm{Re}(s)-1})$
and tends to 0 since $\mathrm{Re}(s) < 1$.
Over the small circle, $|z^s| = O(r^{\mathrm{Re}(s)})$ and $(1+z^2)^{-1} \sim 1$. Therefore, the
integral is $O(r^{1+\mathrm{Re}(s)})$, and so we reach the same conclusion since $\mathrm{Re}(s) > -1$.


Ultimately, taking (5) into account, we get
$(1 - e^{2\pi i s})\varphi(s) = \int_\mu = \pi e^{\pi i s/2}(1 - e^{\pi i s})$


for $|\mathrm{Re}(s)| < 1$, and so
(15.10) $\varphi(s) = \frac{\pi}{2\cos(\pi s/2)},$
which ends the third proof.
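The last algebraic step, from the residue sum to the cosine, is easy to check for complex $s$. The following is a minimal sketch in Python; the names `phi_from_contour` and `phi_closed` are illustrative, and the point $s = 0$ is excluded because the factor $1 - e^{2\pi i s}$ vanishes there.

```python
import cmath

def phi_from_contour(s):
    """Solve (1 - e^{2 pi i s}) * phi(s) = 2 pi i (rho_i + rho_{-i}) for phi(s).

    Residues of z**s/(1+z**2) with 0 < Arg z < 2*pi, so that
    i = exp(i*pi/2) and -i = exp(3*i*pi/2).
    """
    rho_plus = cmath.exp(1j * cmath.pi * s / 2) / 2j
    rho_minus = -cmath.exp(3j * cmath.pi * s / 2) / 2j
    contour_integral = 2j * cmath.pi * (rho_plus + rho_minus)
    return contour_integral / (1 - cmath.exp(2j * cmath.pi * s))

def phi_closed(s):
    """Closed form pi / (2*cos(pi*s/2)), here for complex s."""
    return cmath.pi / (2 * cmath.cos(cmath.pi * s / 2))

for s in (0.3 + 0.7j, -0.5 + 2j, 0.9 - 0.1j):
    print(s, phi_from_contour(s), phi_closed(s))
```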


This method shows how to calculate the Mellin transform of a rational
function $f(x) = p(x)/q(x)$ without poles in $\mathbb{R}_+$. As $f$ is finite at $x = 0$,
the integral converges in the neighbourhood of 0, at least for $\mathrm{Re}(s) > 0$.
If $d^\circ(q) - d^\circ(p) = n$, then at infinity, $f(x) \asymp x^{-n}$ and hence $f(x)x^{s-1} \asymp
x^{s-n-1}$ so that the integral converges for $\mathrm{Re}(s) < n$. The Mellin transform
is, therefore, a priori defined on the vertical strip $0 < \mathrm{Re}(s) < n$. As $\Gamma_f(s)$ is
obtained by integrating $f(x)x^{s-1}$ with respect to the measure $dx$,


(15.11) $[1 - \exp(2\pi i s)]\,\Gamma_f(s) = 2\pi i \sum \mathrm{Res}\left[z^{s-1}f(z), a\right],$
the sum being extended to all the poles of $f$. If
$f(z) = \sum A_k/(z - a_k)$


only has simple poles, then $\mathrm{Res}\left[z^{s-1}f(z), a_k\right] = A_k a_k^{s-1}$.
For example, for $f(z) = (1+z)^{-1}$ the Mellin integral converges for $0 <
\mathrm{Re}(s) < 1$ and


$(-1)^{s-1} = \exp[\pi i(s-1)] = -\exp(\pi i s)$


needs to be calculated.


(15.12) $\int_0^{+\infty} \frac{x^{s-1}\,dx}{1+x} = \pi/\sin \pi s$ for $0 < \mathrm{Re}(s) < 1$


immediately follows. For $f(z) = (z-a)^{-k}$ with $a \notin \mathbb{R}_+$ and $k \ge 1$, the residue
of $z^s f(z)$ is the coefficient of $(z-a)^{k-1}$ in the Taylor series for $z^s$ at $a$. Now,
by Newton, for $|z-a| < |a|$,


$z^s = [a+(z-a)]^s = a^s[1+(z-a)/a]^s =$
$= a^s \sum s(s-1)\ldots(s-p+1)\,[(z-a)/a]^p/p!.$


The residue at $a$ is, therefore, $\binom{s}{k-1}a^{s-k+1}$, and so


$(1 - e^{2\pi i s})\int_0^{+\infty} \frac{x^s\,dx}{(x-a)^k} = 2\pi i\binom{s}{k-1}a^{s-k+1}$ for $-1 < \mathrm{Re}(s) < k-1$.
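Formula (15.12) lends itself to a direct numerical test. The following is a minimal sketch in Python for real $s$ in $]0,1[$; the function name `mellin_of_simple_pole`, the substitution $x = e^u$ and the truncation parameters are assumptions of this illustration.

```python
import math

def mellin_of_simple_pole(s, half_width=300.0, step=0.05):
    """Trapezoidal estimate of the integral of x**(s-1)/(1+x) over (0, +inf).

    Valid for real 0 < s < 1. With x = exp(u) the integrand becomes
    exp(s*u)/(1+exp(u)), which decays like exp(s*u) at -inf
    and like exp(-(1-s)*u) at +inf.
    """
    n = int(round(half_width / step))
    total = 0.0
    for k in range(-n, n + 1):
        u = k * step
        weight = 0.5 if abs(k) == n else 1.0
        total += weight * math.exp(s * u) / (1.0 + math.exp(u))
    return total * step

for s in (0.3, 0.5, 0.8):
    print(s, mellin_of_simple_pole(s), math.pi / math.sin(math.pi * s))
```

For $s = 1/2$ both columns reduce to $\pi$.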


The fact that the function $1/\cosh \pi x$ is identical to its Fourier transform
can at first seem a mere curiosity, interesting only because it gives rise to
exercises. The Fourier transform of the function $1/\cosh \pi t x$ is $t^{-1}/\cosh(\pi x/t)$
for $t > 0$, and as these are functions in $\mathcal{S}(\mathbb{R})$, Poisson's summation formula
applies:


(15.13) $\sum 1/\cosh(\pi n/t) = t\sum 1/\cosh \pi n t.$
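Relation (15.13) can be tested numerically for a few values of $t > 0$. The sketch below is in Python; the names `sech` and `sech_sum` and the cut-off `n_max` are illustrative, and the sums are taken over all integers $n$ with $|n| \le$ `n_max`.

```python
import math

def sech(x):
    """Return 1/cosh(x), or 0.0 where cosh(x) would overflow a float."""
    return 0.0 if abs(x) > 700.0 else 1.0 / math.cosh(x)

def sech_sum(t, n_max=2000):
    """Sum of 1/cosh(pi*n*t) over the integers n with |n| <= n_max."""
    return sum(sech(math.pi * n * t) for n in range(-n_max, n_max + 1))

for t in (0.3, 0.5, 2.0):
    left = sech_sum(1.0 / t)   # sum of 1/cosh(pi*n/t)
    right = t * sech_sum(t)    # t times the sum of 1/cosh(pi*n*t)
    print(t, left, right)
```

The guard in `sech` only avoids a floating-point overflow; the terms it replaces by zero are far below machine precision.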
Let us then consider the similar series 
(15.14) $f(z) = \sum 1/\cos(\pi n z), \qquad z \notin \mathbb{R},$


and show first that it converges normally on every half-plane of the form
$\mathrm{Im}(z) \ge r > 0$. Indeed, in this half-plane, writing $z = x + iy$,


$2|\cos(\pi n z)| = \left|e^{\pi(ny - inx)} + e^{\pi(-ny+inx)}\right| \ge$
$\ge e^{\pi|n|r} - e^{-\pi|n|r} \ge e^{\pi|n|r} - 1.$


Therefore, the convergent series $\sum 1/(e^{\pi|n|r} - 1)$ dominates series (14) in the
half-plane considered.
Let us now show that f satisfies two simple functional equations. First, 


(15.15) $f(z+2) = f(z).$
On the other hand, relation (13) means that
(15.16) $f(-1/z) = (z/i)f(z)$


holds for purely imaginary $z = it$. The two sides being analytic on the half-plane
$\mathrm{Im}(z) > 0$, (16) holds in it.
Relations (15) and (16) resemble those proved in Chap. VII, n° 28 for the
Jacobi function


$\theta(z) = \sum \exp(\pi i n^2 z) = 1 + 2(q + q^4 + q^9 + q^{16} + \ldots),$


where $q = \exp(\pi i z)$. As an aside, note that the $\theta(x)$ series used to obtain the
functional equation of the zeta function is in fact the value of $\theta(z)$ for $z = ix$.
Here too, the series converges for $\mathrm{Im}(z) > 0$. Thanks to Poisson's summation
formula and to the fact that the function $x \mapsto \exp(-\pi x^2)$ is equal to its
Fourier transform, we showed that


$\theta(-1/z) = (z/i)^{1/2}\theta(z).$


This is why Riemann used it to prove the functional equation for his series
$\zeta(s)$. We also have


$\theta(z+2) = \theta(z).$
Therefore, the function $\theta(z)^2$ satisfies (15) and (16). In fact,
(15.17) $f(z) = \theta(z)^2.$


In Chap. XII, the proof of (17) will lead us directly to the theory of modular 
functions and to the classical formula giving the number of ways an integer 
can be represented as the sum of two squares. 
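Pending the proof, identity (17) can at least be tested on the imaginary axis, where $f(it) = \sum 1/\cosh(\pi n t)$ and $\theta(it) = \sum \exp(-\pi n^2 t)$. The following is a minimal numerical sketch in Python; the function names and the cut-off `n_max` are illustrative assumptions.

```python
import math

def sech(x):
    """Return 1/cosh(x), or 0.0 where cosh(x) would overflow a float."""
    return 0.0 if abs(x) > 700.0 else 1.0 / math.cosh(x)

def f_on_imaginary_axis(t, n_max=200):
    """f(it) = sum over n of 1/cos(pi*n*i*t) = sum over n of 1/cosh(pi*n*t)."""
    return 1.0 + 2.0 * sum(sech(math.pi * n * t) for n in range(1, n_max + 1))

def theta_on_imaginary_axis(t, n_max=200):
    """theta(it) = sum over n of exp(-pi*n*n*t)."""
    return 1.0 + 2.0 * sum(math.exp(-math.pi * n * n * t) for n in range(1, n_max + 1))

for t in (0.5, 1.0, 3.0):
    print(t, f_on_imaginary_axis(t), theta_on_imaginary_axis(t) ** 2)
```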


IX — Multivariate Differential and Integral 
Calculus 


§ 1. Classical Differential Calculus — § 2. Differential Forms of Degree 1 — § 3. Integration of Differential Forms — § 4. Differential
Manifolds 


§ 1. Classical Differential Calculus 


The aim of this § is to present the differential calculus of multivariate func- 
tions in the framework of finite dimensional real vector spaces; the case of a 
2-dimensional space having been dealt with in Chap. III, § 5, we will mostly
generalize the results, the proofs being the same as in dimension 2. Nonethe- 
less, this § also contains considerations about tensors that are not found 
everywhere. 


1 — Linear Algebra and Tensors 


(i) Finite-dimensional vector spaces.¹ The elements of such a space $E$ are,
depending on the context, called “points” or “vectors”, the numbers by which
vectors are multiplied (real or complex depending on the needs of the analysis)
being “scalars”; in what follows, the letter $K$ will indifferently denote $\mathbb{R}$, $\mathbb{C}$
or any other field in which the scalars vary. A basis for an $n$-dimensional
vector space $E$ is a family $(a_i)$ of $n$ linearly independent vectors, i.e. such
that any $h \in E$ can be written in a unique way as $h = \sum h^i a_i$, the scalars $h^i$
being the “components” or “coordinates” of $h$ with respect to the basis
considered.²


¹ For details and proofs, see for example §§ 10 to 24 in Cours d'algèbre
(Hermann, 1966 or 1997) by the author, or else the somewhat condensed thirteen 
pages of the Annex to Eléments d’analyse, vol. 1, by Dieudonné, or Serge Lang, 
Linear Algebra (Springer, 1987), etc. 

² The use of the letter h instead of x to denote vectors is due to the fact that in
analysis, vectors occur mostly as increases of a variable, like in the notion of a 
differential defined later. 
If $E$ and $F$ are vector spaces over the same field $K$, their Cartesian product
$E \times F$ can be regarded as a vector space over $K$ by defining the fundamental
operations by


$(x',y') + (x'',y'') = (x'+x'', y'+y''), \qquad t(x,y) = (tx,ty).$


With the same assumptions, a map $u : E \to F$ is said to be linear if
$u(x+y) = u(x)+u(y)$ and $u(tx) = tu(x)$ for any vectors $x$ and $y$ and any
scalar $t$; more generally,


$u\left(\sum t_i x_i\right) = \sum t_i u(x_i)$


for all vectors $x_i$ and scalars $t_i$. If $(a_i)$ forms a basis for the initial space $E$,
then for any vector $h = \sum h^i a_i$,


(1.1) $u(h) = \sum h^i u_i$ where $u_i = u(a_i) \in F$.


If $(b_j)$ is a basis for $F$, setting $u(a_i) = \sum u_i^j b_j$, the table of coefficients $u_i^j$ is
the matrix of $u$ with respect to the chosen bases of $E$ and $F$.

For $F = K$, the term linear functionals or, sometimes, that of covectors is
used. The $u_i \in K$ are the coefficients of $u$. The set of these forms, equipped
with the obvious algebraic operations (sum of two forms, product of a form by
a scalar), is the dual space of $E$, written $E^*$; the following notation is often used:


$\langle h, u\rangle = u(h)$ for $h \in E$, $u \in E^*$,


similarly to a scalar product. In analysis, the case $K = \mathbb{R}$ and $F = \mathbb{C}$ is
constantly needed; the term complex linear functionals is then used; they
are given by (1) with coefficients $u_i \in \mathbb{C}$ and their set, the dual complex space
$E^*_{\mathbb{C}}$, is a complex vector space whose dimension over $\mathbb{C}$ is equal to that of $E$
over $\mathbb{R}$. Each basis $(a_i)$ of $E$ has an associated dual basis $(a^i)$ of $E^*$ over $\mathbb{R}$
or of $E^*_{\mathbb{C}}$ over $\mathbb{C}$ consisting of linear functionals


(1.2) $a^i : h \mapsto h^i$


on $E$.
Each linear map $A : E \to F$ is associated to its transpose ${}^tA$ or $A'$ :
$F^* \to E^*$, given by the relation


$\langle A(h), u\rangle = \langle h, {}^tA(u)\rangle;$


this definition is justified by the fact that for given $u$, the left hand side is a
linear function of $h \in E$. The relation ${}^t(BA) = {}^tA\,{}^tB$ holds for all linear maps
$A : E \to F$ and $B : F \to G$.

Classical results on determinants will also be needed. If $\dim(E) = n$,
then, up to a constant factor, there is a unique function $D(h_1,\ldots,h_n) \in K$
of $n$ variables $h_i \in E$ satisfying the following two properties: (i) it is multilinear,
i.e. a linear function of $h_i$ when the other variables are held fixed, (ii) it is
alternating, i.e. changes sign when two of the variables $h_i$ and $h_j$ are permuted.
Choosing a basis $(a_i)$ for $E$ and assuming $D(a_1,\ldots,a_n) = 1$, which
determines $D$, the number $D(h_1,\ldots,h_n)$ is the determinant of the $h_i$ with
respect to the given basis. It is non-zero if and only if the $h_i$ are linearly
independent, i.e. form a basis of $E$. The formula for calculating $D(h_1,\ldots,h_n)$
explicitly from the coordinates of the $h_i$ can be found everywhere.

Given a linear map $u : E \to E$, the determinant of $u$ is the determinant
of the vectors $u(a_i)$, where $(a_i)$ is an arbitrary basis for $E$; it does not depend
on the basis chosen. Its basic properties are (i)


(1.3') $D[u(h_1),\ldots,u(h_n)] = \det(u)\,D(h_1,\ldots,h_n)$
for all $h_i \in E$, and so
(1.3'') $\det(u \circ v) = \det(u)\det(v)$


for all $u, v : E \to E$, (ii) $u$ is injective (or what amounts to the same in
finite dimension, surjective or bijective) if and only if $\det(u) \neq 0$.

More generally, consider a linear map $u : E \to F$, where $E$ and $F$ may
have different dimensions, and let $r$ be its rank, i.e. the dimension of the
subspace $u(E)$ of $F$. Let $A = (u_i^j)$ be the matrix of $u$ with respect to two
arbitrary bases of $E$ and $F$. Square matrices of order $\le \min[\dim(E), \dim(F)]$
can be extracted from it by arbitrarily choosing the same number of rows and
columns of $A$. Having said that, $r$ is the largest integer for which a square
matrix of order $r$ and non-zero determinant can be extracted from $A$.

Finally, note that instead of “finite-dimensional vector space”, we will 
mostly use the expression real or complex Cartesian space when K = R or 
C. These are the only cases occurring in classical analysis. 


(ii) Tensor notation. When Albert Einstein began his work in general
relativity, he learnt the hard way, with the help of his mathematician friends
who had read the Italian literature on differential geometry, not to mix up
vectors and linear functionals (and, more generally, to distinguish what are
called covariant tensors from contravariant ones, see below), despite the fact
that a vector has as many components as a linear functional has coefficients.
Indeed, under a change of basis, by (1), the coefficients of a linear functional
undergo the same linear transformation as the basis vectors, whereas the
coordinates of a vector undergo a “contragredient” one. This term is used here
in the sense it has in the context of square matrices: the inverse of the
transpose. For example in the simplest case, where the basis $(a_1,\ldots,a_n)$ is
replaced by the basis $(t_1a_1,\ldots,t_na_n)$, with scalars $t_i \neq 0$, the $u_i$ are multiplied
and the $h^i$ divided by the $t_i$. Hence, if, with respect to a particular basis, a
vector and a linear functional appear to coincide because they have the same
coordinates, this is not the case with respect to others; equating them has
no physical or mathematical meaning. Anyhow, they are not objects of the
same nature: a function defined on a set is not an element of this set.
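The simplest case just described can be made concrete. The following is a minimal sketch in Python with exact rational arithmetic; the numerical values and the name `pairing` are arbitrary illustrations. After rescaling the basis vectors by the $t_i$, the coefficients of the functional are multiplied by $t_i$, the coordinates of the vector are divided by $t_i$, and the value $u(h)$ is unchanged.

```python
from fractions import Fraction

def pairing(u_coeffs, h_coords):
    """Value u(h) = sum of u_i * h^i in a given basis."""
    return sum(u_i * h_i for u_i, h_i in zip(u_coeffs, h_coords))

# Coordinates of a vector h and coefficients of a functional u in the basis (a_i).
h = [Fraction(3), Fraction(-1), Fraction(4)]
u = [Fraction(2), Fraction(5), Fraction(-7)]

# New basis (t_1 a_1, t_2 a_2, t_3 a_3) with non-zero scalars t_i.
t = [Fraction(2), Fraction(-3), Fraction(1, 2)]

h_new = [h_i / t_i for h_i, t_i in zip(h, t)]  # coordinates are divided by t_i
u_new = [t_i * u_i for u_i, t_i in zip(u, t)]  # u(t_i a_i) = t_i u(a_i)

print(pairing(u, h), pairing(u_new, h_new))
print(h_new == u_new, h == u)
```

Both pairings are equal, although neither the coordinates nor the coefficients are individually preserved.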
Specialists of classical tensor calculus had adopted a system of notation,
now mostly obsolete (including in my Cours d’algébre), that, nevertheless, 
had some advantages; it was based on inferior and superior indices and on 
the summation convention attributed to Einstein. 

In a space $E$ with a basis denoted by $(a_i)$, the first convention consists,
at the simplest level, in writing vectors and linear functionals as follows:


(1.4) $h = \sum h^i a_i, \qquad u(h) = \sum u_i h^i,$


where the $h^i$ are the components of $h$ and the $u_i = u(a_i)$ the coefficients of $u$
with respect to the basis considered. By (2), the second relation (4) can also
be written $u(h) = \sum u_i a^i(h)$, i.e.


(1.5) $u = \sum u_i a^i,$


so that the $u_i$ are the coordinates of $u \in E^*$ with respect to the dual basis
$(a^i)$ of $(a_i)$ given by $a^i(h) = h^i$. In calculations involving both vectors and
linear functionals, this notation, with its inferior and superior indices, makes
it possible to immediately detect the nature of the objects discussed;³ this is
its first advantage.

We generalize this notation to more complex objects, namely tensors. In a
finite-dimensional vector space $E$, a tensor is a function $T$ of several variables,
some with values in $E$, the others in $E^*$, and satisfying the same multilinearity
property as a determinant: if all variables except one are held fixed, we get a
linear function of the remaining variable. This property generalizes constantly
used calculation rules in elementary algebra:


$(x+y)z = xz + yz, \qquad (tx)z = t(xz).$


Calculation rules for tensors are, therefore, the same as those for products,
excepting commutativity.

The function $T$ is generally real or complex valued. $T$ is said to be of
type $(p,q)$, or $p$ times covariant and $q$ times contravariant, if it depends on
$p$ variables in $E$ and $q$ variables in $E^*$. A scalar is a tensor of type (0,0).
A linear functional is a tensor of type (1,0). A vector $h \in E$, identified with
the linear functional $u \mapsto u(h)$ on $E^*$, becomes a tensor of type (0,1). A
Euclidean scalar product $(h|k)$ is a tensor of type (2,0). A linear map $T :
E \to E$ becomes a tensor of type (1,1) if the function $T(h,u) = u[T(h)]$ is
associated to it, and conversely. Given a tensor $S(x,u)$ of type (1,1) and a
tensor $T(x,y,u)$ of type (2,1), the function


$(x,y,z,u,v) \mapsto S(x,u)T(y,z,v)$


is a tensor of type (3,2), the tensor product $S \otimes T$ of $S$ and $T$. This can be
generalized in an obvious way to other types, provided the variables involved
in both tensors are fully separated.


³ Purists will reply that coordinates can be dispensed with. It does not seem to
be the opinion of physicists. 
The function
$(x,y,u) \mapsto S(x,u)T(y,x,u)$


is obviously not a tensor: the function $x^2$ is not linear on $\mathbb{R}$ or any other field.
Determining the way in which a tensor depends on the coordinates of its
variables with respect to a given basis $(a_i)$ is easy. For example, if $T(h,k,u)$
is a tensor of type (2,1), then leaving out the summation signs with respect
to $i$, $j$ and $p$ (this is Einstein's summation convention, which I will mostly
use),
$T(h,k,u) = T(h^i a_i, k, u) = h^i T(a_i,k,u) = h^i T(a_i, k^j a_j, u) =$

$= h^i k^j T(a_i,a_j,u) = h^i k^j T(a_i,a_j,u_p a^p) =$

$= h^i k^j u_p T(a_i,a_j,a^p)$
and so
(1.6) $T(h,k,u) = T_{ij}^p h^i k^j u_p$
where the


$T_{ij}^p = T(a_i,a_j,a^p)$


play the role of coefficients, components or coordinates (the terminology
matters little) of $T$ with respect to the basis considered. Conversely, any
function given by a formula of type (6) is clearly trilinear in $h$, $k$, $u$.
Relation (6) easily gives the transformation undergone by the coefficients
$T_{ij}^k$ under a change of basis from $(a_i)$ to $(b_p)$ given by


(1.7) $b_p = \rho_p^i a_i,$


where $(\rho_p^i)$ is the transition matrix from the first basis to the second one. It
suffices to note that for the corresponding dual bases,


(1.8) $b^q = \theta_j^q a^j$ with $\rho_p^i\theta_i^q = \delta_p^q.$


The well-known “Kronecker delta” equals 1 or 0 according to whether $p$ and
$q$ are equal or not; indeed, by definition of a dual basis


$\delta_p^q = b^q(b_p) = \theta_j^q a^j(\rho_p^i a_i) = \theta_j^q\rho_p^i\delta_i^j = \rho_p^i\theta_i^q.$


Formula (6) then shows that
(1.9) $T(b_p,b_q,b^r) = \rho_p^i\rho_q^j\theta_k^r\,T(a_i,a_j,a^k),$


where the summation is over $i$, $j$ and $k$. Conversely, it is easy to see that if
we associate numbers $T_{ij}^k$ transformed according to (9) under basis change
to each basis $(a_i)$ for $E$, then the trilinear functional defined by (6) is
independent of the basis $(a_i)$.
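The transformation law (1.9) and the invariance of (1.6) can be checked on a small example. The following is a minimal sketch in Python in dimension 2 with integer entries; the particular matrices, the component values and the helper names are arbitrary illustrations. Here `R[p][i]` stands for $\rho_p^i$, `TH[i][q]` for $\theta_i^q$ (the inverse matrix), and `T[i][j][k]` for $T_{ij}^k$.

```python
from itertools import product

N = 2
R = [[1, 2], [3, 5]]      # R[p][i] = rho_p^i, so that b_p = rho_p^i a_i
TH = [[-5, 2], [3, -1]]   # TH[i][q] = theta_i^q, the inverse matrix of R

# Components T[i][j][k] = T_{ij}^k = T(a_i, a_j, a^k) in the old basis.
T = [[[1, -2], [0, 3]], [[4, 1], [-1, 2]]]

def evaluate(comp, h, k, u):
    """Formula (1.6): T(h, k, u) = T_{ij}^p h^i k^j u_p."""
    return sum(comp[i][j][p] * h[i] * k[j] * u[p]
               for i, j, p in product(range(N), repeat=3))

def transform(comp):
    """Formula (1.9): components with respect to the new basis (b_p)."""
    new = [[[0] * N for _ in range(N)] for _ in range(N)]
    for p, q, r in product(range(N), repeat=3):
        new[p][q][r] = sum(R[p][i] * R[q][j] * TH[k][r] * comp[i][j][k]
                           for i, j, k in product(range(N), repeat=3))
    return new

def old_vector_coords(new_coords):
    """h = h'^p b_p = h'^p rho_p^i a_i, hence h^i = h'^p rho_p^i."""
    return [sum(new_coords[p] * R[p][i] for p in range(N)) for i in range(N)]

def new_covector_coeffs(old_coeffs):
    """u'_p = u(b_p) = rho_p^i u_i."""
    return [sum(R[p][i] * old_coeffs[i] for i in range(N)) for p in range(N)]

h_new, k_new = [1, 2], [-1, 3]   # coordinates in the new basis
u_old = [2, 1]                   # coefficients in the old basis

value_old = evaluate(T, old_vector_coords(h_new), old_vector_coords(k_new), u_old)
value_new = evaluate(transform(T), h_new, k_new, new_covector_coeffs(u_old))
print(value_old, value_new)
```

The two printed values coincide, which is the numerical form of the statement that the trilinear functional does not depend on the basis.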

The convention of placing some indices in an inferior position and others
in a superior one can then be explained as follows. Given a basis $(a_i)$, choose
two linear functionals $f = f_i a^i$ and $g = g_j a^j$, a vector $h = h^k a_k$ and consider
their tensor product $f \otimes g \otimes h$, i.e. the tensor


$S(x,y,u) = f(x)g(y)u(h)$ $(x, y \in E,\ u \in E^*)$


of type (2,1) like $T$. Its coefficients in the given basis are the numbers
$S_{ij}^k = f(a_i)g(a_j)a^k(h) = f_i g_j h^k.$


Formula (9), therefore, expresses that the coefficients of $T$ transform like
those of $f \otimes g \otimes h$; besides, by (6), a tensor of type (2,1) is clearly a linear
combination of such products. Hence, the position of the indices immediately
leads to formula (9).

There are more complicated formulas, but in all cases within the ambit of 
tensor calculus, both sides are sums of monomials with indices, for example 


(*) AG = BPP Cl Diy ay MoknN¥? ; 


A formula of this type is a relation between the tensors A, B,C,D,M and 
N with coordinates or components in each basis, written Ajj,, etc.; it has 
relevance only if it holds for any basis. For this and for the formula to have 
a chance of being correct, it must satisfy the following conditions: 


(a) An index appearing only once in a monomial is a free variable on which 
the monomial considered depends; it must occur once and only once 
in the inferior or superior position in all the monomials of the relation 
considered in order to ensure that with respect to this index, all the 
monomials and hence their sum are transformed likewise ; 

(b) unless otherwise indicated, an index occurring twice in a monomial is a 
summation index, hence a bound or ghost variable on which the mono- 
mial does not depend; it must occur once in an inferior position and 
once in a superior position in order to ensure that, under basis change, 
linear transformations undergone by the monomials relative to these two 
indices cancel out ; 

(c) a summation index cannot occur more than twice in a given monomial 
and cannot occur as a free variable in any other monomials since then, 
by rule (a), it would also be occurring as a free variable in all other 
monomials. 


Exercise. Using formulas such as (9), show that the A, given in each 
basis by (*) are indeed the components of a tensor. 
Exercise. Let $T$ be a tensor of type (5,3). Show that the numbers


$T_{ijkpq}^{jqr} = S_{ikp}^r$
are the components of a tensor of type (3,1).

These conventions hold in (*); p and q are summation indices whose names
matter little, which makes it possible to use the same summation index p in
many different monomials, but does not allow the first term of the right hand 
side to be written as B??'CJ,, Dep since it is a sum over all couples (p,q), and 
not only over couples such that p = q. Similarly, if we multiply two sums, 
then we face serious problems if we write that 


$\sum a_i b^i \cdot \sum c_i d^i = \sum a_i b^i c_i d^i,$
which is a higher level version of the immortal identity
$(a+b)(c+d) = ac + bd.$
The correct way to write this, especially if the $\sum$ are omitted, is
$a_i b^i \cdot c_j d^j = a_i b^i c_j d^j$


in accordance with the distributivity rule of multiplication with respect to
addition: the $i$-th term of the first sum is multiplied by the $j$-th term of the second
one and we sum over all couples $(i,j)$. As a basic precaution, this amounts
to denoting the free or bound variables with different meanings by different
letters. Similarly, a double integral is not written as $\iint f(x,x)\,dx\,dx$; it is
written $\iint f(x,y)\,dx\,dy$; if the function $f$ depends on an additional variable
$z$, so that its integral with respect to $x$ and $y$ depends on $z$, it is written
$\iint f(x,y,z)\,dx\,dy$; calling $z$ the integration variable would lead to a
completely different result, namely $\iint f(z,y,z)\,dy\,dz$.
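The difference between the two ways of writing the product of two sums is easy to see on numbers. The following is a minimal sketch in Python; the four lists are arbitrary illustrative values.

```python
a = [1, 2]
b = [3, 4]
c = [5, 6]
d = [7, 8]

# Correct: distinct summation indices i and j, i.e. a_i b^i c_j d^j.
correct = sum(a[i] * b[i] * c[j] * d[j] for i in range(2) for j in range(2))

# Incorrect: the same index reused, which keeps only the couples with i = j.
wrong = sum(a[i] * b[i] * c[i] * d[i] for i in range(2))

product_of_sums = sum(x * y for x, y in zip(a, b)) * sum(x * y for x, y in zip(c, d))

print(correct, product_of_sums, wrong)
```

Only the version with two distinct indices reproduces the product of the two sums.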
Einstein’s convention aims at simplifying typography; for example 


$c_i^j h^i u_j$ instead of $\sum c_i^j h^i u_j$ or of $\sum_{i=1}^{p}\sum_{j=1}^{q} c_i^j h^i u_j$


if the indices $i$ and $j$ vary within the indicated limits; besides, in general
there is no ambiguity about this matter. As mentioned above, several
mathematicians now censor this way of writing on the grounds that «this deluge of
indices gives me seasickness», as Dieudonné used to say, he who was very
sensitive to the latter (confirmed during a three day tempest in September 1950),
and that considering mathematical objects themselves is anyhow better than
considering their coordinates or components. This is undeniable, but forces
one to think and is often far less quick.

In fact, tensor notation only applies in some very particular circumstances
(multilinear algebra and differential geometry) where it can be very
convenient. I will, therefore, use it systematically in this chapter whenever
general theoretical calculations are involved, while giving the intrinsic
formulas that allow coordinate calculations to be avoided; the reader will thus
be able to compare both viewpoints. Moreover, if the deluge of indices which
I found amusing when eighteen still does not terrify me sixty years later, I do
not see why I should deprive my readers of the pleasure of mastering them
before casting them out into outer darkness instead of comparing the $\Gamma_{ij}^k$ to
an “unpleasant insect” like Serge Lang, apparently impressed by Kafka, does
in the preface of his Fundamentals of Differential Geometry...

As mentioned earlier about Baire, before the war, the municipal library 
of Le Havre used to propose to its readers all the usual French textbooks and 
treatises of the time. It also had the complete collection of the Mémorial des 
sciences mathématiques, a short monograph series on the most varied topics, 
excepting almost all “modern” ones that were then being elaborated outside 
France. It was in general too difficult or too little interesting for me and 
I was anyhow sufficiently busy learning more directly useful mathematics. 
Nonetheless, one day I was stunned by the booklet on Absolute Differential 
Calculus by René Lagrange, professor in Dijon; the word “absolute” that 
had intrigued me had been introduced by the Italians who had perhaps read 
Balzac. It was a presentation of tensor analysis in Riemann spaces, a vaguely 
defined notion: it was more or less possible to understand that it entailed 
n-dimensional curved spaces and used curvilinear coordinate systems that 
could be changed at will by formulas involving only functions that were as
differentiable as necessary; the square of the distance $ds$ from a point $x$ with
coordinates $(x^i)$ to an “infinitely near” point $(x^i + dx^i)$ could be calculated
by a formula of type


$ds^2 = g_{ij}(x)\,dx^i dx^j;$


these spaces contained strange objects having an “absolute” meaning (what
meaning? a mystery), tensors represented in each coordinate system by
functions of a variable point $x$ in the space, assigned inferior and superior indices;
these were supposed to be transformed, under any change of coordinates, by
formulas provided in advance involving the first derivatives of the new coordinates
with respect to the old ones; finally, height of virtuosity, the signs $\sum$ were
always omitted. All this was presented without any allusion as to what a curved
space, a vectorial space, a linear functional or multilinear form, etc. were; in- 
ventors and users of tensor calculus were still in the position of a physicist 
engaged in vector analysis calculations (gradient, rotational, divergence, etc.) 
without knowing what a vector is. As will be explained below (n° 12 and 14),
“tensors” of the time are just tensor fields, i.e. functions $T$ that, independently
of any coordinate system, associate to each point $x$ of the “curved
space” $X$ a tensor $T(x)$ of type $(p,q)$, in the purely algebraic sense given to
this notion above, on the vector space $X'(x)$ depending on $x$ and having the
same dimension as $X$: the “tangent” vector space of $X$ at $x$, whose somewhat
abstract definition will be given in n° 12. It is, therefore, a generalization
of the vector fields of physicists and mathematicians, which are just vector
valued functions, i.e. tensor fields of type (0,1). But it was marvelous; only
the machinery of traditional differential calculus needed to be known
(essentially, the chain rule), and no other idea than that of constructing valid formulas
for all systems of curvilinear coordinates was involved. This only implied
compliance with Einstein's conventions. The “curvature” and “geodesics” of
the space were calculated and finally, the tensor notation presented above
was systematically used and led to almost automatic calculations. In barely a
month, I became an expert in absolute differential calculus without
understanding anything. I gained nothing at the time, excepting gymnastics, but
it was not worse than spending two hours per day in front of the television,
Flight Simulator or Tomb Raider being as yet unknown.

There was also another exception in this collection: a far more modern
and difficult treatise on La géométrie des espaces de Riemann, by Elie Cartan.
It contained few calculations and indices, but mainly abstract and somewhat
vague ideas, for example the distinction between “closed” (i.e. compact)
spaces and “open” (i.e. non-compact) ones. The formulations invented after 1945
were unknown at the time; they came about thanks to the clarification of the
notion of a topological space, to the definitive crystallization of the theory of
abstract differential manifolds,⁴ in particular by Claude Chevalley,⁵ and to the
invention of fiber spaces by algebraic topologists partly inspired by Elie Cartan
who, during the war, invented a broad generalization of tensors connected to
group theory. In the 1950s, the theory acquired the perfect and abstract form
that can be found in all presentations of the subject, starting with N. Bourbaki's
Fascicule de résultats. Naturally all this only concerns its most basic aspects and
did not prevent its development in often unexpected directions too hard to present
in the Bourbaki style, that is, with a maximum of abstractions and generalities.


In the special issue on Bourbaki of the magazine Pour la Science (the
French version, but not a translation, of Scientific American), Benoit Mandelbrot
is alleged to have said, p. 82, that he left the Ecole Normale Supérieure
for the Ecole Polytechnique because thanks to my uncle,⁶ I knew they were
a militant gang, that they were strongly prejudiced against geometry and science,
and that they tended to despise and even humiliate those who did not
follow them; and the author of this declaration apparently left France for the
United States (and IBM) in 1958 because of their stifling influence.

I am in a good position to appreciate the influence of Bourbaki on the 
Ecole normale in 1944. Between 1940 and 1953, the one and only member of 


⁴ i.e. that are no longer considered subsets of Cartesian spaces, as used to be the
case. The most popular example is the space of general relativity; most people find
it hard to understand precisely because of this, all the more so when it appeared
that it could be “closed” or bounded, i.e. compact. Mystics, a species that is
not endangered, wondered what there could be outside: dread is the feeling of
nothingness (Heidegger).

⁵ Theory of Lie Groups, Princeton UP, 1946.

⁶ Szolem Mandelbrojt, professor at the College de France and a specialist of
quasi-analytic functions of one variable, a difficult subject that did not acquire the
importance of the great fields developed after the war; see chapter 19 of Rudin's
book. Szolem Mandelbrojt belonged to the initial Bourbaki group, but quickly
left due to his very different conception of mathematics. This proves nothing
against Bourbaki nor against Mandelbrojt. Everyone is free.

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the “militant gang” then in Paris was Henri Cartan, the others being in the 
provinces or in the United States. He was the only one to look after students 
who were then supposed to follow the classes of the Sorbonne and to prepare 
for the advanced degree known as agrégation. From this strategic position, 
he obviously and inevitably influenced them for about twenty years starting
from 1940, the year when I went to the Ecole. Like everyone, he had his own 
conception of mathematics and propagated it, though in a less flamboyant 
style than M. Mandelbrot. 

Anyhow, before the 1950s, there was almost no other mathematician in
Paris likely to generate enthusiasm among students from the Ecole seriously
attracted to mathematics, and even less to explain to them anything else
apart from pre-1914 stuff.⁷ Elie Cartan, being too old, did not teach anymore.
The Lebesgue integral was sometimes taught by Arnaud Denjoy but
his lectures were incomprehensible. As for Lebesgue, he preferred to teach 
elementary geometry at the College de France, an institution where, if I am 
not mistaken, professors are supposed to present recent subjects likely to be 
further developed; so it was Cartan who taught us in a very concise manner 
what a (Radon) measure and an integrable function were. Gaston Julia, a 
specialist of analytic functions reconverted to Hilbert spaces, was certainly 
around; provided one knew German, one could have learnt much more and
much more quickly by reading some sixty pages by von Neumann in Mathematische
Annalen of 1928-29 than by following his classes; even Henri Cartan,
a non-specialist, used to teach us almost as much in a few lectures as Julia
whose classes did not lead to any aspects of the subject subsequently developed.
Jacques Dixmier experienced this before converting with great success
to von Neumann’s rings of operators,⁸ a theory dating from the 1930s and
seemingly unknown to Julia. I also followed lectures by Paul Montel, where 
he presented his theory of normal (i.e. compact) families of analytic func- 
tions, a 1910 model. We could also learn fluid mechanics from Henri Villat 
and Joseph Pérès, but the subject did not attract very many students from
the Ecole Normale.

7 And that is saying a lot since all that the German school had invented, from
Gauss to Hilbert, in algebraico-analytic fields had been forgotten in France for
at least fifty years. See the chapter on “Jeunes Turcs contre pontifes sclérosés”
in the Pour la Science issue on Bourbaki.

8 Dixmier claimed this was due to me. It is true I had discovered them thanks to my
habit, acquired in Le Havre, of opening books randomly, for example the volumes
of Annals of Mathematics. Von Neumann’s papers had a great advantage for
ignorant youngsters like us: they could be read almost without knowing anything.

The only major exception was, shortly after 1945, Jean Leray, a very
temporary member of the initial Bourbaki group which he used to criticize
virulently on a personal level, as I witnessed during a private conversation
with him in 1950 in Cambridge, Mass. A specialist of partial differential equations
in fluid dynamics before the war, he was made a prisoner in June 1940;
detained for almost five years in an officers’ camp in Austria where he had
organized some sort of university, and not wishing Germans to benefit from his
expertise, he had converted to algebraic topology. Between 1940 and 1945, he
invented the basic ideas of sheaf theory, which he expounded to André Weil
and to Cartan shortly after his release. The theory was immediately adopted 
and greatly improved first by the former and then by very young people — my 
contemporary Jean-Louis Koszul, Jean-Pierre Serre and the Swiss Armand 
Borel, all members of Bourbaki before long. Sorry! —, and was the subject of 
a famous Cartan seminar talk in 1950-51. I wrote a book on it some years 
later⁹ when Grothendieck, a member of the group and converted by Serre,
was starting to revolutionize the subject and algebraic geometry. Elected to 
the College de France in 1947, Leray presented in 1947-48 and 1949-50 what 
was to become his major article on sheaves in the Journal de Liouville of 1950 
and as such certainly contributed to the education of a few people. In 1950, 
Leray returned for good to partial differential equations and obviously influ- 
enced some of the young people who chose this promising subject, though in 
the traditional meaning of the word, he had very few students. In any event, 
it is clear that had Benoit Mandelbrot opted in 1944 to come to the ENS
instead of Polytechnique and followed Leray’s lectures from 1947 onwards in 
order not to fall into the hands of the militant gang, he would have learnt one 
of the most abstract and “modern” subjects. Anyhow, no one would have 
prevented him from choosing what suited him. 

In 1949, Gustave Choquet arrived in Paris. Whilst promoting in his un- 
dergraduate lectures, when he had the opportunity to do so from 1954-55 on- 
wards, a version of general topology that most members of the group never 
dared to diffuse at this level of abstraction, he was a major expert of fine 
structure theory, in line with Lebesgue, Baire and Denjoy and of potential 
theory together with Jacques Deny and, briefly, Henri Cartan. The inventor 
of fractals would have got on well with him had he not chosen Polytechnique, 
an institution where he probably learnt little else than traditional mathe- 
matics and the obligation of standing to attention when professors entered 
the lecture hall. The only advantage of Polytechnique was that one learnt
other subjects, in particular physics, slightly more modern on some points
(which was not hard) than at the Sorbonne. Besides, Polytechnique offered
its best students much better career prospects and influence networks than
what rue d’Ulm could propose to its scientists in those days. But no one
would have prevented students in mathematics who so desired from taking an
interest, for example, in theoretical physics if only its mathematical aspects
had been taught in Paris. During the war, in one semester of lectures at the
Ecole (I have forgotten the year), Louis de Broglie had not even gone as far
as writing the Schrödinger equation on the blackboard. Physicists themselves
learnt quantum mechanics on their own from German, American, English or
Soviet writers while waiting for Messiah’s classes at the CEA (Commissariat à
l’énergie atomique).

9 It was far removed from the mathematics I was involved in at the time. But a
Bourbaki member is supposed to write down everything in the program of the
group, and, moreover, I was somewhat upset at constantly hearing discussions
on algebraic topology, which I did not understand at all. When, very temporarily,
Bourbaki decided to prepare a book on the subject, I volunteered to write a first
account of sheaf theory. Once the chapter was written and discussed together, it
became clear that Bourbaki could not publish this type of mathematics for a
long time, and I was advised to turn it into a book. It is quite different from my
original report and still sells, Hermann publishers having recently reissued it in
French without consulting me, when the necessity of an English version has been
long obvious (the first one was quickly translated into Russian)... The moral
of the story and of many others of this type is that the group represented for
its members, in particular its youngest ones, a fantastic opportunity for learning
mathematics.

During the era mentioned by M. Mandelbrot, within a few years, Henri
Cartan had launched his students from the ENS in fields as diverse as potential
theory, topology of Lie groups, Lie algebras, homotopy groups, differential
topology, functions of several complex variables, etc. Some of his students
from the Ecole normale chose subjects he knew little about, such as
non-commutative harmonic analysis in my case (discovered from André Weil’s
book and my reading in the library of the Ecole or of the Henri Poincaré
Institute), or nothing about, such as algebraic geometry in the case of Pierre
Samuel, influenced by Chevalley in Princeton, not to mention those who chose
the theory of trigonometric series like Jean-Pierre Kahane, probability theory
like Gerard Debreux, etc. Generally speaking, everyone was perfectly free, on
the understanding that, like everyone, Cartan did not take charge of those
who chose fields he was completely ignorant about or that he did not find
interesting at all, or that were totally outdated: he directed them to others if
they could be found. If four students graduating from the ENS deserved a
CNRS (Centre national de la recherche scientifique) grant and if the CNRS
only offered two, he obviously had to choose those he would support.

By 1955 at the latest, students supervised by Cartan had many opportu- 
nities to learn from other “pure” or applied mathematicians. Serre’s lectures 
and talks at the College de France and those of Laurent Schwartz at the 
Henri Poincaré Institute, to mention only these two members of the Bour- 
baki group, met with prodigious success for several decades. The same is true 
for those of Jacques-Louis Lions a few years later. As a student of Schwartz, 
he could not initially avoid Bourbaki’s influence. 

Contrary to what Benoit Mandelbrot and numerous other critics seem to 
think, no one in the group was unaware of the existence of other important 
fields aside from “fundamental structures” or despised them if they were 
not outdated or too light; most of the original work of its members goes beyond
these. The idea that we held strong prejudices against geometry is odd
when it concerns people for whom Elie Cartan was the only French master
and when André Weil was giving a solid foundation to the Italians’ algebraic
geometry, with their “generic points”! Bourbaki took no interest in it as 
such precisely because they were not part of these structures, which kept it 
sufficiently busy: given our program, we would have aroused much mirth if, 
instead of starting the Eléments de Mathématique with set theory, algebra 
and general topology, we had directly embarked on PDEs, stochastic processes,
operational research, turbulent flows or the mathematics of quantum
mechanics, all of which are subjects that, we are told today, half a century
later, suppose a new conception of mathematics, at the opposite end
to ours. Incidentally, reading Dautray-Lions’s volumes or some recent talks 
at the Bourbaki Seminar is far from confirming this point of view. Around 
1948 together with Dieudonné and Schwartz, I attended some superb lectures 
by Jean Delsarte, a founding member of Bourbaki, on analytic number theory,
following Hardy-Littlewood-Rademacher-Winogradov’s version; though
contrary to the Bourbaki spirit, the subject did not exactly arouse
reactions of scorn from the audience. Dieudonné’s superb article on analytic 
number theory in the Encyclopaedia Universalis or, in a neighbouring field, 
my talks at the Bourbaki Seminar (1952-1953) on Hecke’s work (zeta func- 
tions of number fields, modular functions), the first of their kind in France in 
a field that had not yet been modernized, attest to this. One of the members 
of the group during this period, Charles Pisot, was a transcendental number 
specialist, a subject not very much in the Bourbaki line for the time. 

As for the other sciences, we had far fewer prejudices against geology than
those shown nowadays by M. Claude Allègre, a great specialist of the subject and a
recent education minister, against our version of mathematics: we merely ignored
it, for obvious reasons. Generally speaking, we did not consider it our duty
to provide experimentalists with mathematics in its traditional form, which
they had most often learnt in their youth and were determined to preserve. In
fact, their physics has become in many aspects as abstract as our mathematics
and it is hard to see why we should have had to grant them the exclusivity
of “modernism”.

The conclusion that follows from all this seems to me to be that it is above 
all to Henri Cartan and to the enthusiasm of the members of the Bourbaki 
group for “modern” mathematics that the French mathematical school owes 
its post-war recovery of the position it had lost since Poincaré, Picard and 
Lebesgue. 

Another rarely acknowledged factor needs to be mentioned. The weak- 
ness and isolation of the French school before 1939 is often explained by 
referring to the Great War, its massacres and to the hostility towards Ger- 
many that continued long after 1918 in particular because of Emile Picard. 
Conversely, its renewal after 1945 is perhaps also due to the fact that,
during the war, the French (including Leray in his Oflag, including people
like Schwartz, Samuel and the young Grothendieck threatened by antisemitic
policies, including Weil and Chevalley in the USA or in Brazil) had nothing
else to do apart from “real” mathematics at a time when their German,
English, American, Russian, etc. contemporaries were engaged in doing work
for war purposes of a far lower quality than they were capable of or, like Polish
Jews, were ending their lives in Nazi concentration camps.

After 1945, mainly in the United States, but also in Japan, Germany, 
the USSR where an excellent school of functional analysis had flourished for 
a long time, many mathematicians who, at first, were unaware of even the
existence of Bourbaki but not of “abstract” and “modern” mathematics
which it is far from having invented, returned or converted to this type of 
mathematics where the prospects were enormous; for example in algebra, 
van der Waerden’s books and later those of Birkhoff-MacLane have probably 
exerted a greater worldwide influence than those of Bourbaki; in fact, it is in 
van der Waerden that Koszul and I learnt algebra at the Ecole Normale. 

As I have explained in the postscript to vol. II, it is also thanks to the sup- 
port of the military and of industrialists after 1945 that a “militant gang”, 
far more influential than us and much more keen to propagate applied math- 
ematics was formed; the latter is now on the point of dominating our science. 

They are also starting to dominate the minds of some eminent pure
mathematicians. When in the year 2000, “year of mathematics”, during
a religious ceremony, a journalist questioned Alain Connes (Fields medallist,
College de France), he replied as follows according to Le Monde of
25 May 2000:

Euclid’s questioning was answered by research on non-Euclidean geometry, 
which stimulated Riemannian Geometry, which in turn inspired Albert Einstein 
for his work on space-time and general relativity used to refine the global 
satellite positioning system (GPS). 

It would not have been in vain to add that the GPS, like inertial guidance 
earlier, was developed by the American military for their own planes, ships 
and ground vehicles and for locating with extreme precision enemy objects; 
their intention was not to help walkers in the Fontainebleau forest or in the 
Great Erg nor to help taxi drivers navigate in Chicago suburbs. Without
American military funding, no civilian firm would have begun such an extravagant
project needing hundreds of billions in investments and decades of
technical progress regarding missiles, satellites and telecommunications. The
fact that the GPS is now available in the civilian sector does not change the
fact that we ought to be able to find better illustrations of the
usefulness of mathematics from Euclid to Einstein than the guidance system
of the most terrifying armaments ever invented by humanity. Because this is
what the GPS also continues to be in 2001 even if, for the sake of the cause, we
prefer, as always, to cover up that bosom, which we can’t endure to look on.


2 — Differential Calculus of n Variables 


(i) Differentiable functions. Let f be a function defined on an open subset U
of an n-dimensional real Cartesian space¹⁰ E with values in a p-dimensional
Cartesian space F, for example C if p = 2. As in the case n = 2 recalled in
chapter VIII, f is said to be differentiable at x ∈ U if there is a linear map
u : E → F such that, for sufficiently small h ∈ E,


(2.1) $f(x + h) = f(x) + u(h) + o(h),$


10 In fact, almost everything applies to functions defined on and with values in
Banach spaces.

the symbol o(h) denoting any function such that the ratio¹¹ ||o(h)||/||h||
approaches 0 as h tends to 0. The linear map u is unique¹² because, for a linear
function, the relation u(h) = o(h) implies that u = 0: for any h ∈ E, the ratio
||u(th)||/||th|| must approach 0 as t ∈ R tends to 0, though it is independent
of t. In (1), u is called the differential or the derivative of f at x, or else
the tangent linear map to f at x. As it depends on the point x, it is written
df(x) or f'(x), its value at a vector h being written df(x;h) or f'(x)h. It is
also given by


(2.2) $f'(x)h = df(x;h) = \text{value of } \frac{d}{dt} f(x + th) \text{ for } t = 0.$

And so, more generally,

(2.2') $df(x + th; h) = \frac{d}{dt} f(x + th)$


since the left hand side is the derivative of the function s ↦ f[(x + th) + sh] =
f[x + (t + s)h] at s = 0. For any x, obviously


(2.3) $df(x;h) = f(h)$ and $f'(x) = f$ if f is linear.
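
As a numerical illustration of (2.2), the following Python sketch compares f'(x)h computed by hand with a central difference quotient of t ↦ f(x + th) at t = 0. The map f, the point x, the vector h and the step size are hypothetical choices made for this example only; none of them comes from the text.

```python
import math

def f(x):
    """Hypothetical map f: R^2 -> R^2, f(x1, x2) = (x1^2 * x2, sin(x1) + x2)."""
    x1, x2 = x
    return [x1 * x1 * x2, math.sin(x1) + x2]

def df_exact(x, h):
    """f'(x)h, written out by hand from the partial derivatives of f."""
    x1, x2 = x
    h1, h2 = h
    return [2 * x1 * x2 * h1 + x1 * x1 * h2, math.cos(x1) * h1 + h2]

def df_numeric(func, x, h, t=1e-6):
    """Central difference approximation of d/dt func(x + t h) at t = 0."""
    plus = func([xi + t * hi for xi, hi in zip(x, h)])
    minus = func([xi - t * hi for xi, hi in zip(x, h)])
    return [(p - m) / (2 * t) for p, m in zip(plus, minus)]

x, h = [1.0, 2.0], [0.3, -0.5]
exact = df_exact(x, h)
approx = df_numeric(f, x, h)
max_error = max(abs(a - b) for a, b in zip(exact, approx))
print(exact, approx, max_error)
```

The difference quotient only approximates the derivative, so the two vectors agree up to a small numerical error rather than exactly.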


Definition (2) is related to that of partial derivatives. If a basis $(a_i)$ for the
vector space considered is chosen and if $x^i$ generally denotes the coordinates
of a point x, so that f(x) becomes a function of these n real variables, the
partial derivatives of f at x are the vectors¹³


(2.4) $D_i f(x) = \text{value of } \frac{d}{ds} f(x + s a_i) \text{ for } s = 0$
$= df(x; a_i) = f'(x) a_i$

of F obtained by differentiating the function
(2.5) $s \mapsto f(x^1, \ldots, x^{i-1}, x^i + s, x^{i+1}, \ldots, x^n)$


with respect to s at s = 0. Having said that:
(a) The existence of df(x) implies that of partial derivatives $D_i f(x) =
df(x; a_i) \in F$ at x as well as the relation


(2.6) $df(x;h) = D_i f(x) h^i,$

where the scalars $h^i$ are placed on the right of the vectors $D_i f(x)$ contrary
to tradition,¹⁴


11 Write ||h|| for the norm of a vector h defined by any reasonable formula.

12 and, in the case of Banach spaces, continuous, since (1) shows that ||u(h)||
remains bounded when h remains in a sufficiently small ball centered at 0.

13 The notation $D_i$ indicates differentiation with respect to the i-th variable. Its
name does not need to be specified. Some authors write $\partial_i$ instead of $D_i$. There
is, of course, also Jacobi’s notation $\partial/\partial x^i$, which I will write $d/dx^i$, a notation
that can be easily typed and is self-explanatory.

(b) df(x) exists if the partial derivatives $D_i f$ exist and are
continuous in the neighbourhood¹⁵ of x (Chap. III, §5, n° 20).

(6) reduces to numerical expressions by choosing a basis $(b_j)$ for F and
setting $f(x) = f^j(x) b_j$, whence


(2.7) $D_i f(x) = D_i f^j(x) b_j,$

since the $b_j$ are independent of x; as a result,

(2.8) $f'(x)h = df(x;h) = D_i f^j(x) h^i b_j.$

The coordinates of the vector df(x;h) ∈ F are, therefore, the numbers
(2.9) $df(x;h)^j = D_i f^j(x) h^i = df^j(x;h).$


This is the value at the vector h of the differential at x of the function $f^j$.
When the $D_i f(x)$ exist and are continuous on U, f is said to be of class
$C^1$ on U, etc.


(2.10) $D_i D_j f(x) = D_j D_i f(x)$


if f is of class $C^2$ and in fact, using weaker assumptions (Chap. III, § 5, n° 23).

The n x p matrix whose entries are the $D_i f^j(x)$ is the Jacobian matrix
of f at x. It is that of the linear map f'(x) with respect to the two chosen
bases of E and F since, by (4) and (7), $f'(x) a_i = D_i f(x) = D_i f^j(x) b_j$.


The rank (n° 1, (i)) of this linear map is the rank of f at x. As the determinants
of the square sub-matrices of the Jacobian matrix are continuous
functions of x if f is $C^1$, if f is of rank r at x, then its rank is clearly ≥ r in
the neighbourhood of x. In the case of a map from E to itself, a determinant
can be associated to f'(x). This is the Jacobian


(2.11) $J_f(x) = \det f'(x) = \det\left(D_i f^j(x)\right)$


of f at the point x; it was earlier called the functional determinant of the $f^i$
at x and was written $D(f^1, \ldots, f^n)/D(x^1, \ldots, x^n)$ or D(f)/D(x) for short.
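
To make the Jacobian matrix and the Jacobian (2.11) concrete, here is a small Python sketch that approximates the entries $D_i f^j(x)$ by central differences and then takes the determinant. The polar coordinate map, the evaluation point and the helper names are assumptions chosen for illustration; for this particular map the Jacobian is known to equal r.

```python
import math

def polar(x):
    """Hypothetical example map f(r, theta) = (r cos(theta), r sin(theta))."""
    r, theta = x
    return [r * math.cos(theta), r * math.sin(theta)]

def jacobian(func, x, s=1e-6):
    """Approximate the matrix of func'(x): rows[j][i] is close to D_i func^j(x)."""
    p = len(func(x))
    rows = [[0.0] * len(x) for _ in range(p)]
    for i in range(len(x)):
        forward = list(x)
        backward = list(x)
        forward[i] += s
        backward[i] -= s
        f_plus, f_minus = func(forward), func(backward)
        for j in range(p):
            rows[j][i] = (f_plus[j] - f_minus[j]) / (2 * s)
    return rows

def det2(m):
    """Determinant of a 2 x 2 matrix given as a list of rows."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

point = [2.0, 0.7]
J = jacobian(polar, point)
jac_det = det2(J)
print(J)
print(jac_det)  # close to r = 2.0
```

Storing the component index j as the row and the variable index i as the column is a choice made here; the determinant does not depend on it.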


14 It suffices to set once and for all that th = ht if h is a vector and t a scalar: there
is no problem since the field R is commutative.

15 In a topological space, a relation involving a variable x holds in the neighbourhood
of a if there is an open subset U containing a such that it holds for all x ∈ U
(Chap. I, §1, n° 3 for the case of R or C).

Using the differential notation df = f'(x)dx as in the case of a real variable
is possible and in practice very useful. Clearly, the differential of the
coordinate function $u^i : x \mapsto x^i$ is, by (3), at each point of E the linear
functional $u^i : h \mapsto h^i$. For a complex valued function f, the formula
$df(x;h) = D_i f(x) h^i$ can, therefore, also be written


$df(x;h) = D_i f(x)\, du^i(x;h).$


This shows that the relation $df(x) = D_i f(x)\, du^i(x)$ between df(x) and the
$du^i(x)$, linear functions of the ghost vector h, holds in the complex dual of
E. But as $du^i(x)$ is in fact independent of x, it may as well be written $du^i$;
and as $u^i$ denotes the function $x \mapsto x^i$, its differential may as well be written
$dx^i$. So $df = D_i f.dx^i$ for short.

Leibniz would not have had any difficulties explaining that the differential
of f at x is, as in the case of R, the increment of f when the variable x
undergoes an infinitesimal vector increment dx:


$df(x;dx) = f(x + dx) - f(x) = f'(x)dx.$


A priori this formulation is not well-defined, but it is often convenient to use
it in order to quickly recover results. This is why physicists like it despite its
metaphysical character. For example, the formula


$f(x + h.dt) = f(x) + f'(x)h.dt,$


which holds for a given vector h and an “infinitesimally small” dt, immediately
shows that f'(x)h or df(x;h) is the derivative of the function
t ↦ f(x + th) for t = 0.

This can be justified provided the symbol dx is understood in a different
way. The differential at any point of the identity map id : x ↦ x is just
h ↦ h; since it does not depend on the point at which it is calculated, it may
as well be written dx(h) rather than d(id)(x;h) as should theoretically be
the case. The expression df(x;h) or f'(x)h can then be written df[x; dx(h)]
or f'(x)dx(h). So, the expression df(x;dx) = f'(x)dx is shorthand for the
differential of f at the point x. This is all in appearance quite subtle, and in
reality tautological, but it is sometimes useful in order to intuitively understand
the formulas; the following point will illustrate this.


(ii) Multivariate chain rule. This is the formula that allows first order
differential calculations to be reduced to linear algebra calculations; its
importance cannot be overstated. Let E, F, G be Cartesian spaces, U and V
open subsets of E and F respectively, f : U → V and g : V → G two $C^1$
maps, and let us consider the composite map


$p = g \circ f : U \to G.$

It is also a $C^1$ map and, for any h ∈ E,


(2.12) $dp(x;h) = dg[f(x); df(x;h)]$

or, with a different notation,
(2.13) $p'(x) = g'[f(x)] \circ f'(x),$


i.e. the composition of the tangent maps f'(x) : E → F and g'[f(x)] : F → G
to f and g at x ∈ U and f(x) ∈ V. This is the theorem for finding the
derivative of a composite function. It will constantly be used in this chapter.
In Leibniz’s notation, (13) is written


(2.13') $dp(x;dx) = dg(y;dy)$ with $y = f(x)$, $dy = f'(x)dx$.


(13) can easily be remembered by observing that it is the one and only
conceivable formula having a meaning given the degree of generality of the
situation: we want to find a linear map p'(x) : E → G, and as the only linear
maps available are f'(x) : E → F and g'(y) : F → G, their composite
g'(y) ∘ f'(x) : E → G is the only possible candidate. Moreover, as the result
sought should depend only on x, y must depend on it. The only possibility
proposed by the data being the substitution of y by f(x), (13) follows. This
is the beauty of “intrinsic” or “absolute” arguments: the formulas to be
proved are imposed by the very nature of the objects considered (and they
are correct). Leibniz would have explained that


$p(x) + p'(x)dx = p(x + dx) = g[f(x + dx)] = g[f(x) + f'(x)dx] =$
$= g(y + dy) = g(y) + g'(y)dy,$


where y = f(x) and dy = f'(x)dx; hence (13'). Though this argument makes
no sense if dx is interpreted in the same manner as the author of Théodicée,
it leads as easily to the result in the general case as for functions of only one
real variable. The important thing is not to become confused to the point of
thinking, like Leibniz and the physicists of past times, that this calculation
is a genuine proof.¹⁶

When E = F = G in the above, the Jacobians (11) of f, g and p can be
considered. The theorem on products of determinants and (13) then show
that


(2.14) $J_{g \circ f}(x) = J_g[f(x)]\, J_f(x).$


16 Naturally, physicists have a different conception of proofs from mathematicians:
if sloppy mathematical arguments provide them with the formulas confirmed by
their experiments, for them, the formulas have been proved.

(12) can be written in an explicit form in terms of numerical functions.
Choosing a basis $(b_j)_{1 \le j \le p}$ for F, a basis $(c_k)_{1 \le k \le q}$ for G and setting


(2.15) $f(x) = f^j(x) b_j, \quad g(y) = g^k(y) c_k, \quad p(x) = p^k(x) c_k$

with numerical valued functions $f^j(x)$, $g^k(y)$ and $p^k(x)$, we get


(2.16) $p^k(x) = g^k[f(x)].$

On the other hand, relations (6) and (8) show that, for h ∈ E,

(2.17) $dp(x;h) = dg[y; df(x;h)] = dg(y; b_j)\, df(x;h)^j =$
$= D_j g^k(y) c_k . D_i f^j(x) h^i.$

Since by (8) applied to p, $dp(x;h) = D_i p^k(x) h^i c_k$,


(2.18) $D_i\{g^k[f(x)]\} = D_i p^k(x) = D_i f^j(x) . D_j g^k(y)$ where $y = f(x)$


finally follows. This expresses (13) in terms of matrices: the Jacobian matrix
of p at x is the product of the Jacobian matrix of f at x and that of g
at y = f(x). The left hand side of (18) denotes the effect of the operator
$D_i = d/dx^i$ on the function $x \mapsto g^k[f(x)]$, not to be confused with $D_i g^k[f(x)]$,
which is the value of the function $D_i g^k$ at f(x); this value is not usually
well-defined since the function $g^k$ depends on y and not on x. For example,
the second expression, $D_i p^k(x)$, is the value of the function $D_i p^k$ at x, and
$D_j g^k(y)$ the value of the function¹⁷ $D_j g^k$ at y, where $D_j = d/dy^j$. The
presence of a punctuation point in the expression $D_i f^j(x) . D_j g^k(y)$ indicates
that the operator $D_i$ applies only to $f^j(x)$ and not to $f^j(x) D_j g^k(y)$. These
conventions will be systematically used in order to avoid confusion.
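
The matrix form (2.18) of the chain rule, and the product formula (2.14) for Jacobians, can be checked numerically. The Python sketch below does this for two maps of R^2 into itself; the maps f and g, the point x and the finite-difference helper are hypothetical choices for illustration, with rows indexed by the component and columns by the variable.

```python
import math

def f(x):
    """Hypothetical inner map f: R^2 -> R^2."""
    return [x[0] * x[1], x[0] + math.sin(x[1])]

def g(y):
    """Hypothetical outer map g: R^2 -> R^2."""
    return [math.exp(y[0]) + y[1], y[0] * y[1] * y[1]]

def p(x):
    """Composite map p = g o f."""
    return g(f(x))

def jacobian(func, x, s=1e-6):
    """rows[j][i] approximates D_i func^j(x) by a central difference."""
    dim = len(func(x))
    rows = [[0.0] * len(x) for _ in range(dim)]
    for i in range(len(x)):
        forward = list(x)
        backward = list(x)
        forward[i] += s
        backward[i] -= s
        f_plus, f_minus = func(forward), func(backward)
        for j in range(dim):
            rows[j][i] = (f_plus[j] - f_minus[j]) / (2 * s)
    return rows

def matmul(a, b):
    """Product of two matrices given as lists of rows."""
    return [[sum(a[r][k] * b[k][c] for k in range(len(b)))
             for c in range(len(b[0]))] for r in range(len(a))]

def det2(m):
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

x = [0.5, 1.2]
Jf = jacobian(f, x)
Jg = jacobian(g, f(x))      # Jacobian matrix of g taken at y = f(x)
Jp = jacobian(p, x)
product = matmul(Jg, Jf)    # entry [k][i] is the sum over j of D_j g^k . D_i f^j
matrix_error = max(abs(Jp[r][c] - product[r][c]) for r in range(2) for c in range(2))
det_error = abs(det2(Jp) - det2(Jg) * det2(Jf))
print(matrix_error, det_error)
```

Note that the Jacobian matrix of g has to be evaluated at y = f(x), not at x, which is exactly the point made about the notation above.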

These formulas can be simplified if the function g is real valued; this is
then also the case of p and we get


(2.19) $D_i\{g[f(x)]\} = D_j g[f(x)]\, D_i f^j(x).$


In particular, suppose (case E = R) there is a map, written t ↦ μ(t) rather than f,
from an interval of R to the domain of definition V of g. Then, setting D = d/dt,


(2.20) $D\{g[\mu(t)]\} = dg(\mu(t); \mu'(t)) = D_j g[\mu(t)]\, D\mu^j(t)$
at each point t where $D\mu(t) = \mu'(t) \in F$ exists.
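
Formula (2.20) for the derivative of g along a curve can be illustrated as follows. In this Python sketch the function g, the curve μ (a unit circle) and the parameter value are assumptions picked for the example; the right hand side of (2.20) is compared with a difference quotient of t ↦ g[μ(t)].

```python
import math

def g(y):
    """Hypothetical real valued function g(y1, y2) = y1^2 * y2."""
    return y[0] ** 2 * y[1]

def partials_g(y):
    """The partial derivatives D_1 g(y) and D_2 g(y), computed by hand."""
    return [2 * y[0] * y[1], y[0] ** 2]

def mu(t):
    """Hypothetical curve mu(t) = (cos t, sin t)."""
    return [math.cos(t), math.sin(t)]

def mu_prime(t):
    """Derivative of the curve, mu'(t) = (-sin t, cos t)."""
    return [-math.sin(t), math.cos(t)]

def derivative_along_curve(t):
    """Right hand side of (2.20): sum over j of D_j g[mu(t)] * D mu^j(t)."""
    return sum(dj * vj for dj, vj in zip(partials_g(mu(t)), mu_prime(t)))

def numeric_derivative(t, s=1e-6):
    """Central difference approximation of d/dt g[mu(t)]."""
    return (g(mu(t + s)) - g(mu(t - s))) / (2 * s)

t0 = 0.4
chain_value = derivative_along_curve(t0)
numeric_value = numeric_derivative(t0)
curve_error = abs(chain_value - numeric_value)
print(chain_value, numeric_value, curve_error)
```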


17 Generally, the symbol $D_i f$ always denotes the partial derivative of the function f
with respect to the i-th variable on which it depends, irrespective of the letters
used to denote them.

(iii) Partial differentials. Seemingly more complicated composite functions
often need to be considered. For example,


(2.21) $p(u) = g[x(u), y(u), z(u)],$


where u varies in an open subset Ω of a Cartesian space, where the functions
x, y, z map Ω to open subsets U, V, W of three Cartesian spaces E, F, G and
where g is defined on the open subset U × V × W of the Cartesian space
E × F × G. To compute dp, introduce the partial differentials of the function
g(x, y, z) with respect to x, y and z. The partial differential $d_1 g[(x;h), y, z]$,
which depends linearly on an additional variable h ∈ E, is obtained by fixing
y ∈ V and z ∈ W and by differentiating the map x ↦ g(x, y, z); hence, by
definition,


(2.22) $d_1 g[(x;h), y, z] = \frac{d}{dt} g(x + th, y, z)$ for $t = 0$


or, in Leibniz style,
(2.22') $d_1 g[(x;dx), y, z] = g(x + dx, y, z) - g(x, y, z).$


The differentials $d_2 g[x, (y;k), z]$ and $d_3 g[x, y, (z;l)]$ are similarly defined, the
letters k and l denoting vector variables in F and G. As g is defined on an open
subset of E × F × G, its (total) differential depends on a vector varying in this
space, i.e. on three vectors h ∈ E, k ∈ F, l ∈ G. By definition, it is obtained by
considering the value of g at the point (x, y, z) + t(h, k, l) = (x + th, y + tk, z + tl)
and by differentiating the result at t = 0:


(2.23) $dg[(x,y,z); (h,k,l)] = \frac{d}{dt} g(x + th, y + tk, z + tl)$ for $t = 0$;


(2.24) $dg[(x,y,z); (h,k,l)] = d_1 g[(x;h), y, z] + d_2 g[x, (y;k), z] +$
$+ d_3 g[x, y, (z;l)]$


is easily seen to follow. The left hand side is indeed a linear function of the
vector (h, k, l) ∈ E × F × G; as


$(h, k, l) = (h, 0, 0) + (0, k, 0) + (0, 0, l),$


it is, therefore, equal to dg[(x, y, z); (h, 0, 0)] + etc. But, by definition,


$dg[(x,y,z); (h,0,0)] = \frac{d}{dt} g[(x,y,z) + t(h,0,0)] =$
$= \frac{d}{dt} g(x + th, y, z)$ for $t = 0$,

an expression equal to $d_1 g[(x;h), y, z]$ by definition. This gives (24). As an
aside, the following mistake should be avoided: the left hand side of (24) is a
linear function of the vector (h, k, l) in the vector space E × F × G, and not
a trilinear function of the vectors h ∈ E, k ∈ F, l ∈ G.

For example, suppose that, as in the case of a tensor, g(x, y, z) is a trilinear
function of x, y, z. Since x ↦ g(x, y, z) is linear, by (3), its differential $d_1 g$ is
just dx ↦ g(dx, y, z). Hence


(2.25) $dg = g(dx, y, z) + g(x, dy, z) + g(x, y, dz)$

or, more explicitly,


(2.25') $dg[(x,y,z); (h,k,l)] = g(h, y, z) + g(x, k, z) + g(x, y, l).$


More specifically, suppose that the variables x, y, z are n × n matrices or linear
operators on a Cartesian space M and that g(x, y, z) = xyz; we then get the
formula


$dg = dx.yz + x\,dy.z + xy\,dz,$
where the order of the terms needs to be carefully respected; more explicitly,
$dg[(x,y,z); (h,k,l)] = hyz + xkz + xyl$


where, like x, y, z, h, k, l are matrices or linear operators; the spaces E, F, G
in (24) are here identical to the vector space L(M) of linear maps from M
to itself.
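
The formula dg = hyz + xkz + xyl for the matrix product can be tested numerically. The following Python sketch uses hypothetical 2 × 2 matrices and compares the formula with a central difference quotient of t ↦ g(x + th, y + tk, z + tl); it also shows that moving the increment h out of its place changes the result, since matrices do not commute.

```python
def matmul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[r][m] * b[m][c] for m in range(n)) for c in range(n)]
            for r in range(n)]

def g(x, y, z):
    """The trilinear map g(x, y, z) = xyz."""
    return matmul(matmul(x, y), z)

def add3(a, b, c):
    """Entrywise sum of three matrices."""
    n = len(a)
    return [[a[r][s] + b[r][s] + c[r][s] for s in range(n)] for r in range(n)]

def dg(x, y, z, h, k, l):
    """dg[(x, y, z); (h, k, l)] = hyz + xkz + xyl, factors kept in order."""
    return add3(g(h, y, z), g(x, k, z), g(x, y, l))

def shift(a, t, d):
    """The matrix a + t d."""
    n = len(a)
    return [[a[r][s] + t * d[r][s] for s in range(n)] for r in range(n)]

def dg_numeric(x, y, z, h, k, l, t=1e-5):
    """Central difference of t -> g(x + th, y + tk, z + tl) at t = 0."""
    plus = g(shift(x, t, h), shift(y, t, k), shift(z, t, l))
    minus = g(shift(x, -t, h), shift(y, -t, k), shift(z, -t, l))
    n = len(x)
    return [[(plus[r][s] - minus[r][s]) / (2 * t) for s in range(n)]
            for r in range(n)]

def max_gap(a, b):
    return max(abs(a[r][s] - b[r][s]) for r in range(len(a)) for s in range(len(a)))

# Hypothetical sample matrices, chosen only for illustration.
x = [[1.0, 2.0], [0.0, 1.0]]
y = [[0.0, 1.0], [1.0, 3.0]]
z = [[2.0, 0.0], [1.0, 1.0]]
h = [[0.5, 0.0], [1.0, -1.0]]
k = [[1.0, 1.0], [0.0, 2.0]]
l = [[0.0, -1.0], [1.0, 0.5]]

exact = dg(x, y, z, h, k, l)
product_rule_error = max_gap(exact, dg_numeric(x, y, z, h, k, l))
wrong_order = add3(g(y, z, h), g(x, k, z), g(x, y, l))  # h moved to the right
order_gap = max_gap(exact, wrong_order)
print(product_rule_error, order_gap)
```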

Having done this, we can return to the computation of the differential
of the composite function p(u) = g[x(u), y(u), z(u)] we started with. By the
multivariate chain rule, it can be obtained by replacing in dg the variable
(x, y, z) and its differential (dx, dy, dz) by their expressions in terms of u;
and so, without assuming g to be multilinear,


$dp(u;du) = d_1 g\{[x(u); x'(u)du], y(u), z(u)\} + d_2 g\{x(u), [y(u); y'(u)du], z(u)\} +$
$+ d_3 g\{x(u), y(u), [z(u); z'(u)du]\}.$


Having done that, replace du by h. If g is multilinear, the result simplifies by
(25):
(2.26) $dp(u;h) = g[x'(u)h, y(u), z(u)] + g[x(u), y'(u)h, z(u)] +$

$+ g[x(u), y(u), z'(u)h].$
If, moreover, the variable u is real, in which case so is h, then, because
of the multilinearity of g, h becomes a common factor, and since dp(u;h) =
p'(u)h we get
(2.27) $p'(u) = g[x'(u), y(u), z(u)] + g[x(u), y'(u), z(u)] +$

$+ g[x(u), y(u), z'(u)]$
as if it was a matter of differentiating a product x(u)y(u)z(u); this is the key
point. Besides, if it is a genuine product of functions whose values are, for
example, linear operators or n × n matrices, we retrieve the classical formula


$\frac{d}{du}\, x(u)y(u)z(u) = x'(u)y(u)z(u) + x(u)y'(u)z(u) + x(u)y(u)z'(u)$


where, once again, the order of the factors is essential. 
(iv) Diffeomorphisms. Let $U$ be an open subset of a Cartesian space $E$ and
$f$ a map from $U$ to a Cartesian space $F$ of the same dimension as $E$. Suppose
that $f$ maps $U$ onto an open subset $V$ of $F$. Then $f$ is said to be a diffeomorphism of
class $C^p$ from $U$ to $V$ if $f$ as well as the inverse map $g = f^{-1} : V \to U$ are
bijective and of class $C^p$. Clearly, for $y = f(x)$, the linear maps $f'(x)$ and
$g'(y)$ are mutually inverse. If $E = F$, then

$$J_g(y)\,J_f(x) = 1 \quad\text{for } y = f(x),\ g = f^{-1} \tag{2.28}$$

follows and in particular, $J_f(x) \neq 0$ for $x \in U$.

Conversely, let us start with a $C^p$ map $f$ from an open subset $U$ of $E$
to $F$, with $\dim(E) = \dim(F)$, and suppose that $f'(x)$ is invertible for all
$x \in U$. The local inversion theorem (Chap. III, §5, n° 24, Theorem 24),
whose proof in dimension $n$ is similar to the one in dimension 2, tells us that
for all $x \in U$, there is an open neighbourhood $U(x)$ of $x$ homeomorphically
mapped by $f$ onto an open neighbourhood $V(y)$ of $y = f(x)$, the inverse map
from $V(y)$ to $U(x)$ also being $C^p$. If that is the case for all $x \in U$, then
the image $V = f(U)$ is open, and more generally so is the image of any open
subset of $U$. If, moreover, $f$ is injective not only in the neighbourhood of each
point, but globally, and so is a bijection from $U$ onto $V$, then the inverse map
$f^{-1} : V \to U$ can be considered; it is $C^p$, so that $f$ is a diffeomorphism.
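
To see relation (28) at work on a concrete diffeomorphism, here is a minimal sketch using the passage to polar coordinates on the plane minus the negative real axis. The choice of map, the sample point and the finite-difference helper `jacobian_det` are my own illustrative assumptions; the Jacobians are only approximated by central differences.

```python
import math

def jacobian_det(func, point, eps=1e-6):
    # Central-difference Jacobian determinant of a map R^2 -> R^2 at `point`.
    u, v = point
    fu_p, fu_m = func(u + eps, v), func(u - eps, v)
    fv_p, fv_m = func(u, v + eps), func(u, v - eps)
    a = (fu_p[0] - fu_m[0]) / (2 * eps)   # d f^1 / du
    b = (fv_p[0] - fv_m[0]) / (2 * eps)   # d f^1 / dv
    c = (fu_p[1] - fu_m[1]) / (2 * eps)   # d f^2 / du
    d = (fv_p[1] - fv_m[1]) / (2 * eps)   # d f^2 / dv
    return a * d - b * c

def f(x, y):
    # Cartesian -> polar coordinates (r, theta), with |theta| < pi.
    return (math.hypot(x, y), math.atan2(y, x))

def g(r, theta):
    # Inverse map: polar -> Cartesian.
    return (r * math.cos(theta), r * math.sin(theta))

x0 = (1.5, 0.8)          # sample point, an arbitrary choice
y0 = f(*x0)
product = jacobian_det(g, y0) * jacobian_det(f, x0)
print(product)           # close to 1, as (2.28) predicts
```

Here $J_g = r$ and $J_f = 1/r$, so neither vanishes on the domain considered.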

When we prove the change of variable formula for a multiple integral,
we will need to consider a bounded open subset $U$ of a Cartesian space $E$,
its compact closure $A$ and a map $f$ from $A$ to $E$ or, more generally, to a
Cartesian space $F$. Such an $f$ will be said to be of class $C^p$ on $A$ if $f$ is of class $C^p$ on
$U$ and if the partial derivatives of order $\le p$ of the $f^i$ extend to continuous
functions on $A$. Since $A$ is compact, this is the case if and only if these
derivatives are uniformly continuous on $U$ (Chap. V, §2, n° 8, corollary 2
of Theorem 8 generalized to $n$ variables). So the traditional notation for
extensions to $A$ of partial derivatives will continue to be used for the vector
$D_i f(a)$ with coordinates $D_i f^j(a)$, for the linear maps $f'(a) : h \mapsto D_i f(a)h^i$
and the Jacobian $J_f(a)$ will be defined in the obvious manner for all $a \in A$.

When $\dim E = \dim F$, $f$ will be said to be a diffeomorphism of class $C^p$
from $A$ onto $B = f(A)$ if, moreover, (i) $f$ is bijective, in which case $f$ is
a homeomorphism from $A$ onto the compact set $B = f(A)$ (Chap. III, §3,
n° 1, Theorems 11 and 12), (ii) $f$ is a $C^p$ diffeomorphism from $U$ onto an
open set $V \subset F$, and so $B = \overline{V}$, (iii) the inverse map $f^{-1} : B \to A$ is $C^p$ in
the sense defined above.

Conditions (ii) and (iii) require

$$J_f(a) \neq 0 \quad\text{for all } a \in A \tag{2.29}$$

since the two sides of (28) are continuous functions on $A$. Conversely, if
(29) holds, (ii) follows by (i) and the local inversion theorem, and (iii) is a
consequence of the fact that, if $M_f(x)$ is the Jacobian matrix of $f$ at $x \in U$,
the entries of its inverse, i.e. the derivatives of $f^{-1}$, are the quotients of
minors of $M_f(x)$ by $J_f(x)$ (Cramer formulas), and so extend by continuity
to $B$ when (29) holds.

The situation described above is encountered when $f$ is the restriction
to $A$ of a diffeomorphism defined on an open set containing $A$, but the converse
is most dubious: for a function defined on a compact set $A$ to have a
$C^1$ extension on an open set containing $A$, it must satisfy more restrictive
conditions than those imposed above.¹⁸


(v) Immersions, submersions, subimmersions. In section (iv), $f$ was assumed
to map an open subset $U$ of $E$ to a Cartesian space $F$ of the same
dimension as $E$ and the tangent maps $f'(x)$ were assumed to be invertible.
This assumption has no meaning anymore if $\dim(F) \neq \dim(E)$ and even if
$\dim(E) = \dim(F)$, the important notion is that of the rank of $f$ at $x$, defined
in section (i), i.e. the dimension of the vector subspace $\operatorname{Im} f'(x) = f'(x)E$
of $F$. It can be written $\operatorname{rg}_x(f)$; as it is computed by using minors of the
Jacobian matrix, it is a lower semicontinuous function of $x$: for any $M$,
the relation $\operatorname{rg}_x(f) > M$ defines an open subset of $U$. If $\operatorname{rg}_x(f) = \dim(E)$,
i.e. if $f'(x)$ is injective, $f$ is said to be an immersion at $x$; if, conversely,
$\operatorname{rg}_x(f) = \dim(F)$, i.e. if $f'(x)$ is surjective, $f$ is said to be a submersion at $x$.
If, more generally, the rank of $f$ is constant in the neighbourhood of $x$, $f$ is
said to be a subimmersion at $x$. These notions will arise later in relation to
submanifolds of a Cartesian space.
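
These definitions are easy to test on small examples. The following sketch computes the rank of a Jacobian matrix by Gaussian elimination for two maps of my own choosing: the circle $t \mapsto (\cos t, \sin t)$, an immersion everywhere, and $(x,y) \mapsto x^2 + y^2$, a submersion except at the origin. The function names and the tolerance `tol` are assumptions made for illustration.

```python
import math

def rank(rows, tol=1e-9):
    # Rank by Gaussian elimination with partial pivoting.
    m = [[float(v) for v in row] for row in rows]
    n_rows, n_cols = len(m), len(m[0])
    rk = 0
    for col in range(n_cols):
        if rk == n_rows:
            break
        pivot = max(range(rk, n_rows), key=lambda i: abs(m[i][col]))
        if abs(m[pivot][col]) <= tol:
            continue                      # no usable pivot in this column
        m[rk], m[pivot] = m[pivot], m[rk]
        for i in range(rk + 1, n_rows):
            factor = m[i][col] / m[rk][col]
            for j in range(col, n_cols):
                m[i][j] -= factor * m[rk][j]
        rk += 1
    return rk

def jacobian_circle(t):
    # f(t) = (cos t, sin t), a map from R to R^2 (dim E = 1).
    return [[-math.sin(t)], [math.cos(t)]]

def jacobian_square_norm(x, y):
    # f(x, y) = x^2 + y^2, a map from R^2 to R (dim F = 1).
    return [[2.0 * x, 2.0 * y]]

rank_circle = rank(jacobian_circle(0.3))                   # equals dim E
rank_norm_generic = rank(jacobian_square_norm(1.0, 0.0))   # equals dim F
rank_norm_origin = rank(jacobian_square_norm(0.0, 0.0))    # rank drops
print(rank_circle, rank_norm_generic, rank_norm_origin)
```

The second map illustrates lower semicontinuity: the rank can drop at an isolated point (the origin), but the set where it is at least 1 is open.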


3 — Calculations in Local Coordinates 


(i) Diffeomorphisms and local charts. The notion of a diffeomorphism is similar
to that of a (global) curvilinear coordinate system in an open
subset $U$ of an $n$-dimensional Cartesian space $E$. Such a system is a family
of $n$ functions $\varphi^i : U \to \mathbb{R}$ of class $C^1$ at least such that the differentiable
functions on any open set $V \subset U$ are precisely those that can be expressed
in a differentiable manner by using the “coordinates” $\varphi^i(x) = \xi^i$ of $x$; more
precisely, the map

$$\varphi : x \mapsto (\varphi^1(x), \ldots, \varphi^n(x))$$

from $U$ to $\mathbb{R}^n$ is required to be a diffeomorphism from $U$ to an open subset of
$\mathbb{R}^n$. In what follows, we will write $x = f(\xi)$, or sometimes $x(\xi)$ if no confusion
follows, for the point of $U$ corresponding to the point $\xi \in U' = \varphi(U)$, so that the
map $f : U' \to U$ is the inverse of $\varphi$; then

$$f'(\xi) = \varphi'(x)^{-1}, \tag{3.1}$$

the two sides being calculated at corresponding points $x \in U$ and $\xi \in U'$.


18 See for example Dieudonné, Eléments d’analyse, vol. 3, XVI.4, problems 4, 5, 6
(Whitney’s theorems), which deal with the case of $C^\infty$ maps.
At least among mathematicians, the expression “curvilinear coordinates”
has long fallen into disuse. In a Cartesian space $E$, the term chart is preferable
for any diffeomorphism $\varphi$ from an open set $U \subset E$ onto an open subset of
a Cartesian space; $(U,\varphi)$ will denote such a chart. If $a \in U$, $(U,\varphi)$ is also
said to be a local chart of $E$ at the point $a$; it is often convenient to assume
$\varphi(a) = 0$.

The following result immediately shows the usefulness of local charts:

Theorem 1. Let $f$ be an $\mathbb{R}^q$-valued $C^\infty$ map defined in the neighbourhood of
0 in $\mathbb{R}^p$ and such that $f(0) = 0$. Suppose that the rank $r$ of $f$ is constant in
the neighbourhood of 0. Then there are $C^\infty$ local charts $(U,\varphi)$ of $\mathbb{R}^p$ at 0 and
$(V,\psi)$ of $\mathbb{R}^q$ at 0 such that

$$\psi \circ f \circ \varphi^{-1}(\xi) = (\xi^1, \ldots, \xi^r, 0, \ldots, 0)$$

for any $\xi \in \varphi(U)$.


An equivalent formulation: setting $y = f(x)$ and denoting by $\xi^i$ ($1 \le i \le p$)
the coordinates of $x$ in the chart $(U,\varphi)$ and by $\eta^j$ ($1 \le j \le q$) those of $y$ in
the chart $(V,\psi)$,

$$\eta^j = \xi^j \ (1 \le j \le r), \qquad \eta^j = 0 \ (r+1 \le j \le q).$$


To prove the theorem, first note that, up to a permutation of the canonical
coordinates in $\mathbb{R}^p$ and $\mathbb{R}^q$, $D(f^1,\ldots,f^r)/D(x^1,\ldots,x^r) \neq 0$ may be assumed to hold at 0, hence also in a neighbourhood
of 0. Then consider the map

$$\varphi : (x^1,\ldots,x^p) \mapsto (f^1(x),\ldots,f^r(x), x^{r+1},\ldots,x^p)$$


from the latter to $\mathbb{R}^p$. Its Jacobian matrix is of the form

$$\begin{pmatrix} A & 0 \\ ? & 1_{p-r} \end{pmatrix}$$

where $A$ is that of $f^1,\ldots,f^r$ with respect to $x^1,\ldots,x^r$. It is, therefore, invertible
like $A$ in the neighbourhood of 0. As a result, $\varphi$ is a diffeomorphism
from an open neighbourhood $U$ of 0 to an open neighbourhood $U'$ of 0. So,
we get a chart $(U,\varphi)$ of $\mathbb{R}^p$ at 0 for which

$$f \circ \varphi^{-1}(\xi) = (\xi^1,\ldots,\xi^r, g^{r+1}(\xi),\ldots,g^q(\xi)) \tag{3.2'}$$


with new functions $g^i$ ($r+1 \le i \le q$) defined on $U'$. Denoting the Jacobian
matrix of $g^{r+1},\ldots,g^q$ with respect to $\xi^{r+1},\ldots,\xi^p$ by $D$, that of (2') is of the
form

$$\begin{pmatrix} 1_r & ? \\ 0 & D \end{pmatrix}.$$

As its rank is by assumption $r$, all the entries of $D$ are zero.¹⁹ This means
that, in the neighbourhood of 0, the $g^i$ only depend on the first $r$ variables
$\xi^i$. The $g^i$ being defined in the neighbourhood of 0, the expression


$$\psi(y) = \bigl(y^1,\ldots,y^r,\ y^{r+1} - g^{r+1}(y^1,\ldots,y^r),\ \ldots,\ y^q - g^q(y^1,\ldots,y^r)\bigr) \tag{3.2''}$$

is well-defined for all $y \in \mathbb{R}^q$ in the neighbourhood of 0. The Jacobian matrix
of $\psi$ is of the form

$$\begin{pmatrix} 1_r & ? \\ 0 & 1_{q-r} \end{pmatrix}$$

and so is invertible. As a result, $\psi$ is a diffeomorphism from a neighbourhood $V$
of 0 onto an open subset $V'$ of $\mathbb{R}^q$, giving a chart $(V,\psi)$ of $\mathbb{R}^q$ at the origin.
We can obviously assume $f(U) \subset V$ and then compute the map

$$\psi \circ f \circ \varphi^{-1} : U' \to V'$$

which gives $f$ in the charts that have been obtained. This amounts to replacing
$y^1,\ldots,y^q$ in (2'') by the expressions $\xi^1,\ldots,g^q(\xi)$ occurring on the right
hand side of (2'), which replaces $y^{r+k} - g^{r+k}(y^1,\ldots,y^r)$ by
$g^{r+k}(\xi^1,\ldots,\xi^r) - g^{r+k}(\xi^1,\ldots,\xi^r) = 0$; hence

$$\psi \circ f \circ \varphi^{-1}(\xi) = (\xi^1,\ldots,\xi^r,0,\ldots,0),$$

proving the theorem.


(ii) Moving frames and tensor fields. Consider a chart $(U,\varphi)$ and the
inverse $f : U' \to U$ of the map $\varphi$. For any $\xi \in U' = \varphi(U)$, the tangent map
$f'(\xi)$ transforms the canonical basis $(e_i)$ of $\mathbb{R}^n$ into a basis $(a_i(\xi))$ of $E$ which
depends both on the point $x = f(\xi)$ and on the chart $\varphi$, namely

$$a_i(\xi) = f'(\xi)e_i = df(\xi;e_i) = D_i f(\xi). \tag{3.3}$$

This is the partial derivative of the map $f : U' \to E$ at the point $\xi = \varphi(x) \in \mathbb{R}^n$. So, by (1),

$$\varphi'(x)a_i(\xi) = e_i. \tag{3.3'}$$


For example, denoting by $(x,y)$ the usual Cartesian coordinates and setting
$U$ to be the open subset $\mathbb{R}^2 - \mathbb{R}_-$ of $\mathbb{R}^2$ already encountered in Cauchy
theory, set

$$x = r\cos\theta, \quad y = r\sin\theta \quad\text{with } r > 0 \text{ and } |\theta| < \pi.$$


19 Calculate the determinant of order $r+1$ obtained by adding a row and a column
to the matrix $1_r$.
Then the map $\varphi$ is $(x,y) \mapsto (r,\theta)$ and it is a diffeomorphism from $U$ onto the
open subset of $\mathbb{R}^2$ defined by the inequalities imposed on $r$ and $\theta$; its inverse
map is $f(r,\theta) = (r\cos\theta, r\sin\theta)$. Then, to say that a function $p$ defined on
an open set $G \subset U$ is of class $C^1$ means that

$$p(x,y) = P(r,\theta)$$

with a $C^1$ function $P$ defined on the open subset $\varphi(G)$. The differential of
$f$ is $(\cos\theta\,dr - r\sin\theta\,d\theta,\ \sin\theta\,dr + r\cos\theta\,d\theta)$ and, for $\xi = (r,\theta)$, the vectors
$a_i(\xi)$ are obtained by replacing $(dr,d\theta)$ either by $(1,0)$ or by $(0,1)$; hence

$$a_1(\xi) = (\cos\theta, \sin\theta),$$
$$a_2(\xi) = (-r\sin\theta, r\cos\theta).$$
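
As a check on this computation, the moving frame can be recovered numerically as the partial derivatives $D_i f(\xi)$ of $f(r,\theta) = (r\cos\theta, r\sin\theta)$. The sketch below is only an illustration: the sample point, the step `eps` and the function names are my own assumptions.

```python
import math

def f(r, theta):
    # Inverse of the polar chart: (r, theta) -> (x, y).
    return (r * math.cos(theta), r * math.sin(theta))

def moving_frame(r, theta, eps=1e-6):
    # a_i(xi) = D_i f(xi), approximated by central differences.
    a1 = tuple((p - m) / (2 * eps)
               for p, m in zip(f(r + eps, theta), f(r - eps, theta)))
    a2 = tuple((p - m) / (2 * eps)
               for p, m in zip(f(r, theta + eps), f(r, theta - eps)))
    return a1, a2

r, theta = 2.0, 0.7                       # arbitrary sample point
a1, a2 = moving_frame(r, theta)
expected_a1 = (math.cos(theta), math.sin(theta))
expected_a2 = (-r * math.sin(theta), r * math.cos(theta))
frame_error = max(abs(u - v)
                  for u, v in zip(a1 + a2, expected_a1 + expected_a2))
print(frame_error)                        # negligible
```

Note that $a_1$ has length 1 while $a_2$ has length $r$: the frame is orthogonal but not orthonormal.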


Exercise. In $\mathbb{R}^3 - \{0\}$, use spherical coordinates defined by

$$x = r\cos\varphi\cos\theta, \quad y = r\sin\varphi\cos\theta, \quad z = r\sin\theta,$$

with $r > 0$, $0 < \varphi < 2\pi$, $|\theta| < \pi/2$. (It is not exactly a diffeomorphism onto
an open set, but this does not matter). Calculate the $a_i(\xi)$.

The basic idea of classical tensor analysis is to use for all calculations at
the point $x = f(\xi)$ what Elie Cartan called a moving frame $(a_i(\xi))$, instead
of a basis for $E$ chosen once and for all. This amounts to regarding a function
of $x \in U$ as a function of the corresponding point $\xi = \varphi(x)$ of $U'$ and to
calculating in the canonical basis of $\mathbb{R}^n$; we will see (§4) that this is also
what we are forced to do in “curved spaces”, i.e. the differential manifolds of
the modern theory, since they are not contained in a Cartesian space whose
basis we could choose.

For example, let us consider a vector $h \in E$ and calculate its components
$h^i(\xi)$ with respect to the basis $a_i(\xi) = f'(\xi)e_i$ attached to the point $x = f(\xi)$.
Set $\varphi(x) = \varphi^i(x)e_i$, so the coordinates of $x$ in the chart $(U,\varphi)$ are $\xi^i = \varphi^i(x)$;
taking (3) and $f'(\xi) = \varphi'(x)^{-1}$ into account,

$$h = f'(\xi)\varphi'(x)h = f'(\xi)d\varphi(x;h) = f'(\xi)d\varphi^i(x;h)e_i = d\varphi^i(x;h)a_i(\xi).$$

This gives the coordinates $h^i(\xi) = d\varphi^i(x;h)$ sought and the relation

$$h = d\varphi^i(x;h)a_i(\xi) \tag{3.4}$$

that Elie Cartan, following Leibniz, used to write as

$$dx = a_i(\xi)d\xi^i, \tag{3.4'}$$

the expression $d\xi^i$ in these formulas being the differential of the $i$-th “curvilinear”
coordinate $x \mapsto \xi^i = \varphi^i(x)$ considered as a function of $x$. This relation merely
expresses the fact that the $a_i(\xi)$ are the partial derivatives $D_i f(\xi)$ with respect
to the coordinates $\xi^i$ of the inverse map $x = f(\xi)$ of $\varphi$.
In fact, Elie Cartan used more general moving frames than those defined
above from local charts; for him, it was merely a basis $(a_i(x))$ of $E$ depending
on a point $x \in E$ and whose vectors were functions of $x$ as differentiable
as necessary. We will return to this a bit later in our discussion of differential
forms, where it is essential to understand their calculations.


The notion of a local chart allows us to understand Rene Lagrange’s (i.e. 
Ricci’s and Levi-Civita’s) “tensors”, mentioned in n° 1, (ii), and foremost in 
the case of a Cartesian space E since that is all we know until further notice. 

Suppose that there is a tensor field on an open subset $X$ of $E$, for example
a function $T(x;h,k,u)$ that is multilinear in $h,k \in E$ and $u \in E^*$, for all
$x \in X$. Consider a chart $(U,\varphi)$ with $U \subset X$. Write $f$ for the inverse map of $\varphi$
and for $x = f(\xi)$, let $(a_i(\xi))$ be, as above, the image basis under $f'(\xi)$ of the
canonical basis of $\mathbb{R}^n$; write $(a^i(\xi))$ for the dual basis of $E^*$. For $x = f(\xi) \in U$,
let

$$T^k_{ij}(\xi) = T[x; a_i(\xi), a_j(\xi), a^k(\xi)] \tag{3.5}$$

be the components of the tensor $T(x) : (h,k,u) \mapsto T(x;h,k,u)$ with respect
to the basis $(a_i(\xi))$ of $E$. For the founders of tensor calculus, a tensor was
simply a system of components $(T^k_{ij}(\xi))$ attached to each point $x$ and to
each local chart. To understand the change of coordinate formulas that these
components (5) are subject to, the big question is how to calculate the vectors
$a_i(\xi)$ occurring in (5) in terms of similar vectors with respect to another chart.
The problem being local, the latter can be assumed to be defined in the open
subset $U$, hence of the form $(U,\psi)$, where $\psi$ is another diffeomorphism from $U$
onto an open subset of $\mathbb{R}^n$. Hence every $x \in U$ has two types of coordinates,
namely the points $\xi = \varphi(x)$ and $\eta = \psi(x)$ of $\mathbb{R}^n$, and at each point $x \in U$
there are two frames; write


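$$a_i(\xi) = f'(\xi)e_i \quad\text{and}\quad b_\alpha(\eta) = g'(\eta)e_\alpha \tag{3.3''}$$

for these moving frames, where $g$ denotes the inverse map of $\psi$, using Roman (resp. Greek) indices in the first (resp.
second) chart, following the author of Absolute Differential Calculus. As $\varphi$
and $\psi$ are diffeomorphisms from $U$ onto open subsets $V$ and $W$ of $\mathbb{R}^n$, there
are pairwise inverse diffeomorphisms

$$\theta : V \to W, \qquad \rho : W \to V$$

such that

$$\psi = \theta \circ \varphi, \quad \varphi = \rho \circ \psi, \qquad g = f \circ \rho, \quad f = g \circ \theta;$$

this means that the coordinates $\xi^i = \varphi^i(x)$ of a point $x$ in the chart $(U,\varphi)$ and
its coordinates $\eta^\alpha = \psi^\alpha(x)$ in the chart $(U,\psi)$ are connected by the formulas

$$\eta^\alpha = \theta^\alpha(\xi^1,\ldots,\xi^n), \qquad \xi^i = \rho^i(\eta^1,\ldots,\eta^n).$$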
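The relations

$$\theta'(\xi)e_i = \theta^\alpha_i(\xi)e_\alpha, \qquad \rho'(\eta)e_\alpha = \rho^i_\alpha(\eta)e_i$$

also hold. Their coefficients

$$\theta^\alpha_i(\xi) = D_i\theta^\alpha(\xi), \qquad \rho^i_\alpha(\eta) = D_\alpha\rho^i(\eta) \tag{3.6}$$

are the partial derivatives of the changes of coordinates.
Having said that,

$$\psi'(x) = \theta'(\xi)\circ\varphi'(x), \qquad \varphi'(x) = \rho'(\eta)\circ\psi'(x), \tag{3.7}$$

$$g'(\eta) = f'(\xi)\circ\rho'(\eta), \qquad f'(\xi) = g'(\eta)\circ\theta'(\xi). \tag{3.7'}$$

Using (3''),

$$a_i(\xi) = f'(\xi)e_i = g'(\eta)\theta'(\xi)e_i = g'(\eta)\theta^\alpha_i(\xi)e_\alpha = \theta^\alpha_i(\xi)b_\alpha(\eta) \tag{3.8'}$$

and similarly

$$b_\alpha(\eta) = \rho^i_\alpha(\eta)a_i(\xi). \tag{3.8''}$$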


It remains to show how the covectors $a^k(\xi)$ occurring in (5) transform. Now,
we know that in a vector space, if the basis $(a_i)$ is transformed into the
basis $(b_\alpha)$ by $b_\alpha = c^i_\alpha a_i$, then the dual basis $(b^\alpha)$ is transformed into the dual
basis $(a^i)$ by $a^i = c^i_\alpha b^\alpha$. Hence here,

$$a^k(\xi) = \rho^k_\alpha(\eta)b^\alpha(\eta), \tag{3.9'}$$

$$b^\alpha(\eta) = \theta^\alpha_k(\xi)a^k(\xi). \tag{3.9''}$$


Having done that, transformation formulas for the components (5) of the
tensor field $T$ follow immediately by applying relation (1.9) to the trilinear
form $T(x)$; clearly,

$$T^\gamma_{\alpha\beta}(\eta) = \rho^i_\alpha(\eta)\,\rho^j_\beta(\eta)\,\theta^\gamma_k(\xi)\,T^k_{ij}(\xi). \tag{3.10}$$

These are the mysterious transformation formulas of tensors in curvilinear
coordinates that the founders of the theory used to state without proof. Note
that these calculations respect the tensor calculus rules formulated in (ii) of
n° 1.

Exercise. Let $F$ be a $C^1$ function on $E$; show that the functions $p_i(\xi) = D_i\{F[f(\xi)]\}$
are the components of a tensor field of type (0,1) by verifying
that they satisfy the transformation formula (10) for type (1,0). [The tensor
field in question is obviously the function $(x,h) \mapsto dF(x;h)$].

Note that in these calculations, the fact that the moving frames $(a_i(\xi))$
and $(b_\alpha(\eta))$ are associated to charts is of little importance; in all cases, formulas
similar to (8') and (9') and hence to (10) exist. Using local charts has no
other purpose than to express the coefficients $\rho$ and $\theta$ as partial derivatives
of the change of curvilinear coordinate formulas.


(iii) Covariant derivatives on a Cartesian space. A “differential” or a
“covariant derivative” can be assigned to any $C^1$ tensor field $T$. It is also written
$T'$ or, in classical Riemannian geometry, $\nabla T$, where the sign $\nabla$, “nabla”,
supposed to come from Pharaonic Egyptian, must have been chosen by the
inventors of tensor calculus to highlight its esoteric character; this operation
takes a tensor field of type $(p,q)$ to one of type $(p+1,q)$. When calculating in a
chart, a first idea that comes to mind is to go, for example, from a tensor
field $T$ of type (2,1) whose coefficients (5) depend on the coordinates $\xi^i$ of
the point $x$, to a tensor field of type (3,1) whose coefficients would be
the functions $D_i T^r_{pq}(\xi)$, where $D_i = d/d\xi^i$. This is a bad idea because differentiating
formulas (10) would give rise to second derivatives of the $\xi$ with
respect to the $\eta$, which would not lead us to the components of a tensor. It
is better, as always, to argue geometrically: since $T$ associates the trilinear
function $(k,l,u) \mapsto T(x;k,l,u)$ to every $x \in U$, a quadrilinear function can
be deduced by differentiating with respect to $x$ for given $k,l,u$; this is the
covariant derivative $T'$ of the tensor field $T$ defined by the formula


$$T'(x;h,k,l,u) = \frac{d}{dt}\,T(x+th;k,l,u) \ \text{ for } t = 0, \quad = d_1 T[(x;h),k,l,u] \tag{3.11}$$

in accordance with (2.22), with $h,k,l \in E$ and $u \in E^*$. The result is easily
calculated by using a basis $(a_i)$ for $E$ independent of $x$. Indeed, then

$$T(x;k,l,u) = T^r_{pq}(x)\,k^p l^q u_r,$$

with coefficients $T^r_{pq}(x) = T(x; a_p, a_q, a^r)$, and so obviously

$$T'(x;h,k,l,u) = dT^r_{pq}(x;h)\,k^p l^q u_r = D_i T^r_{pq}(x)\,h^i k^p l^q u_r,$$

where $D_i = d/dx^i$. Therefore, in this case, the coefficients of $T'$ are indeed
the partial derivatives of the coefficients of $T$ with respect to the Cartesian
coordinates of $x$. But as we want to use the variable basis $(a_i(\xi))$ for all
calculations at the point $x = f(\xi)$, we have to argue differently.

Applying the definition of $T'$ for $x = f(\xi)$, $h = a_i(\xi)$, $k = a_p(\xi)$, $l = a_q(\xi)$
and $u = a^r(\xi)$, we need to calculate²⁰

$$\nabla_i T^r_{pq}(\xi) = T'[x; a_i(\xi), a_p(\xi), a_q(\xi), a^r(\xi)] = \frac{d}{dt}\,T[x + t a_i(\xi); a_p(\xi), a_q(\xi), a^r(\xi)] \tag{3.12}$$


20 The traditional notation $\nabla_i T^r_{pq}$ indicates that we can go from the components of
$T$ to those of $T'$ by “partial covariant differentiations” similar but not identical
to classical partial differentiations with respect to the variables $\xi^i$.
for $t = 0$. For this, let us differentiate the function

$$T^r_{pq}(\xi) = T[f(\xi); a_p(\xi), a_q(\xi), a^r(\xi)]$$

with respect to $\xi$ by applying the general formulas (2.25) and (2.27) stated at
the end of n° 2. The result is the sum of four partial differentials with respect
to the four variables on which $T(x;h,k,u)$ depends, it being understood that,
in these differentials, the functions $f(\xi), a_p(\xi), \ldots$ and their differentials with
respect to $\xi$ will have to be substituted for $x,h,k,u$ and for their differentials,
as if we had to differentiate the product $f(\xi)a_p(\xi)a_q(\xi)a^r(\xi)$. Differentiating
$T(x;h,k,u)$ with respect to $x$, by definition, we get $d_1 T[(x;dx),h,k,u] = T'(x;dx,h,k,u)$
by (11); the first term of the result sought is, therefore,

$$T'[f(\xi); df(\xi), a_p(\xi), a_q(\xi), a^r(\xi)] = T'[f(\xi); f'(\xi)d\xi, a_p(\xi), a_q(\xi), a^r(\xi)].$$


In the three other terms, we differentiate a linear function; hence it is sufficient
to replace the variable in $T$, for example $a_p(\xi)$, which depends on $\xi$,
by its differential $da_p(\xi;d\xi)$. Replacing $d\xi$ by the variable $h \in \mathbb{R}^n$ on which
the differential of a function of $\xi \in \mathbb{R}^n$ depends, we, therefore, finally get

$$\begin{aligned}
dT^r_{pq}(\xi;h) = {} & T'[x; df(\xi;h), a_p(\xi), a_q(\xi), a^r(\xi)] \\
& + T[x; da_p(\xi;h), a_q(\xi), a^r(\xi)] \\
& + T[x; a_p(\xi), da_q(\xi;h), a^r(\xi)] \\
& + T[x; a_p(\xi), a_q(\xi), da^r(\xi;h)].
\end{aligned}$$


As we are interested in coefficient (12) of $T'$, in the above, we need to take
$h = e_i$, since then $df(\xi;h) = a_i(\xi)$ in the first term of the previous formula.
The derivatives $da_p(\xi;e_i) = D_i a_p(\xi)$, etc. remain to be expressed in terms of the $a_j(\xi)$ and
$a^j(\xi)$ themselves; for this, set

$$D_i a_p(\xi) = \Gamma^j_{ip}(\xi)a_j(\xi), \qquad D_i a^p(\xi) = \Lambda^p_{ij}(\xi)a^j(\xi) \tag{3.13}$$

with numerical coefficients to be determined, i.e. the Christoffel symbols. In
view of (12) and the multilinearity of $T$, we get

$$D_i T^r_{pq}(\xi) = \nabla_i T^r_{pq}(\xi) + \Gamma^j_{ip}(\xi)T^r_{jq}(\xi) + \Gamma^j_{iq}(\xi)T^r_{pj}(\xi) + \Lambda^r_{ij}(\xi)T^j_{pq}(\xi)$$

and so, omitting the $\xi$, the formula

$$\nabla_i T^r_{pq} = D_i T^r_{pq} - \Gamma^j_{ip}T^r_{jq} - \Gamma^j_{iq}T^r_{pj} - \Lambda^r_{ij}T^j_{pq}. \tag{3.14}$$


Note that, since $a_i(\xi) = f'(\xi)e_i = D_i f(\xi)$,²¹

$$D_i a_j(\xi) = D_i D_j f(\xi) = D_j D_i f(\xi) = D_j a_i(\xi),$$


21 If $f$ is $C^2$, a harmless assumption in a context limited to quasi-formal calculations.
and so

$$\Gamma^k_{ij} = \Gamma^k_{ji}. \tag{3.15}$$

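On the other hand,

$$\Lambda^p_{ij} = -\Gamma^p_{ij}$$

for the following reason. Set

$$B(h,u) = u(h) \quad\text{for } h \in E \text{ and } u \in E^*.$$

This gives a bilinear function of $h$ and $u$, so that

$$D_i\{B[h(\xi),u(\xi)]\} = B[D_i h(\xi), u(\xi)] + B[h(\xi), D_i u(\xi)]$$

for any functions $h(\xi)$ and $u(\xi)$. Since $B(a_j,a^p) = \delta^p_j$ by definition of the dual
basis,

$$0 = B(D_i a_j, a^p) + B(a_j, D_i a^p) = \Gamma^q_{ij}B(a_q,a^p) + \Lambda^p_{ik}B(a_j,a^k) = \Gamma^q_{ij}\delta^p_q + \Lambda^p_{ik}\delta^k_j = \Gamma^p_{ij} + \Lambda^p_{ij}$$

as announced. Hence the final formula

$$\nabla_i T^r_{pq} = D_i T^r_{pq} - \Gamma^j_{ip}T^r_{jq} - \Gamma^j_{iq}T^r_{pj} + \Gamma^r_{ij}T^j_{pq} \tag{3.16}$$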

where $D_i = d/d\xi^i$. These are the famous Christoffel formulas (1869), which
generalize in an obvious way to tensor fields of arbitrary type.

Exercise. Are the $\Gamma^k_{ij}$ components of a tensor field?

Index calculators did not leave it at that, for their aim was differential
geometry on “non-Euclidean curved” spaces, and in particular on vector
spaces where distances are computed by non-Euclidean formulas. Using their
terminology, and as has already been alluded to in n° 1, (ii), suppose that
at every point of $E$ there is a simple formula for calculating the length $ds$ of
the vector connecting some $x \in E$ to an infinitesimally near point $x + dx$.
Since in an infinitesimal neighbourhood of $x$ we want the geometry of the
space to be, as a first approximation, like that of a Euclidean space, we take
the square $ds^2$ to be a quadratic form

$$ds^2 = g_{ij}(\xi)\,d\xi^i d\xi^j \tag{3.17}$$

in the coordinates $d\xi^i$ of $dx$, with functions $g_{ij} = g_{ji}$ depending on the point
considered; the right hand side must obviously always be $> 0$ for $dx \neq 0$, in
other words

$$g_{ij}(\xi)h^i h^j > 0 \tag{3.18}$$
for any scalars $h^i$ not all zero. As all this is meaningless if $ds^2$ depends on the
chosen coordinate system, the $g_{ij}(\xi)$ must be the components in the latter of
a tensor field of type (2,0) on $E$ given by

$$g(x;h,k) = g_{ij}(\xi)h^i k^j \quad\text{if } h = h^i a_i(\xi),\ k = k^j a_j(\xi), \tag{3.19}$$

which implies transformation formulas for the $g_{ij}$ similar to (10) when changing
curvilinear coordinates; (19) can be interpreted as a Hilbert scalar product
depending on $x$ and applying to vectors with starting point $x$.

In the simplest case, namely that of traditional Euclidean geometry, let
us once and for all fix a scalar product denoted by $(\,|\,)$; then $g(x;h,k) = (h|k)$
for $h,k \in E$; in the local chart $(U,\varphi)$, the components of this tensor field
at the point $x = f(\xi)$ are the functions

$$g_{ij}(\xi) = (a_i(\xi)|a_j(\xi)) \tag{3.20}$$

and $ds^2$ is given by the formula

$$ds^2 = (a_i(\xi)d\xi^i\,|\,a_j(\xi)d\xi^j) = g_{ij}(\xi)\,d\xi^i d\xi^j.$$


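The Christoffel symbols $\Gamma^k_{ij}$ can then be calculated by using the $g_{ij}$. To do
this, differentiate $g_{jk}$ with respect to $\xi^i$. Writing $a_i(\xi)$ as $a_i$ for short,

$$D_i g_{jk} = (D_i a_j|a_k) + (a_j|D_i a_k) = \Gamma^p_{ij}(a_p|a_k) + \Gamma^p_{ik}(a_j|a_p) = \Gamma^p_{ij}g_{pk} + \Gamma^p_{ik}g_{pj} = \Gamma_{ijk} + \Gamma_{ikj}$$

where we set

$$\Gamma_{ijk} = g_{kp}\Gamma^p_{ij} = \Gamma_{jik}. \tag{3.21}$$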

As an aside, note that this calculation says that the covariant derivative

$$\nabla_i g_{jk} = D_i g_{jk} - \Gamma^p_{ik}g_{jp} - \Gamma^p_{ij}g_{pk}$$

of the tensor field $g$ is zero; this comes as little surprise since

$$g'(x;h,k,l) = \frac{d}{dt}\,g(x+th;k,l) = \frac{d}{dt}\,(k|l) \quad\text{for } t = 0$$

is the derivative of a function independent of $t$.
Let us then write the relations

$$\Gamma_{ijk} + \Gamma_{ikj} = D_i g_{jk}, \quad \Gamma_{jki} + \Gamma_{jik} = D_j g_{ki}, \quad \Gamma_{kij} + \Gamma_{kji} = D_k g_{ij};$$

by adding the first two and subtracting the third, by (21),

$$\Gamma_{ijk} = \tfrac{1}{2}\,(D_i g_{jk} + D_j g_{ik} - D_k g_{ij}) \tag{3.22}$$

immediately follows. This is a well-known formula allowing the $\Gamma_{ijk}$ to be
calculated in terms of the derivatives of the $g_{ij}$; by (21), we go from this to
the $\Gamma^k_{ij}$ by inverting the matrix with entries $g_{ij}$.
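
To see formula (3.22) in action, here is a minimal sketch for the Euclidean plane in polar coordinates, where (3.20) gives $g_{11} = 1$, $g_{22} = r^2$, $g_{12} = 0$. The derivatives of the metric are approximated by central differences. The function names, the storage convention `upper[k][i][j]` for $\Gamma^k_{ij}$ and the sample point are my own assumptions.

```python
def metric(r, theta):
    # Euclidean metric in polar coordinates: g_ij = (a_i | a_j).
    return [[1.0, 0.0], [0.0, r * r]]

def d_metric(i, xi, eps=1e-6):
    # Central-difference approximation of D_i g_jk at the point xi.
    plus, minus = list(xi), list(xi)
    plus[i] += eps
    minus[i] -= eps
    gp, gm = metric(*plus), metric(*minus)
    return [[(gp[j][k] - gm[j][k]) / (2 * eps) for k in range(2)]
            for j in range(2)]

def christoffel(xi):
    n = 2
    D = [d_metric(i, xi) for i in range(n)]      # D[i][j][k] = D_i g_jk
    # Formula (3.22): Gamma_ijk = (D_i g_jk + D_j g_ik - D_k g_ij) / 2.
    lower = [[[0.5 * (D[i][j][k] + D[j][i][k] - D[k][i][j])
               for k in range(n)] for j in range(n)] for i in range(n)]
    g = metric(*xi)
    det = g[0][0] * g[1][1] - g[0][1] * g[1][0]
    g_inv = [[g[1][1] / det, -g[0][1] / det],
             [-g[1][0] / det, g[0][0] / det]]
    # Invert (3.21): Gamma^k_ij = g^{kp} Gamma_ijp, stored as upper[k][i][j].
    upper = [[[sum(g_inv[k][p] * lower[i][j][p] for p in range(n))
               for j in range(n)] for i in range(n)] for k in range(n)]
    return lower, upper

lower, upper = christoffel((2.0, 0.7))           # sample point r = 2
gamma_r_theta_theta = upper[0][1][1]             # close to -r
gamma_theta_r_theta = upper[1][0][1]             # close to 1/r
print(gamma_r_theta_theta, gamma_theta_r_theta)
```

The two non-zero values agree with what (3.13) gives directly: differentiating $a_2 = (-r\sin\theta, r\cos\theta)$ yields $D_2 a_2 = -r\,a_1$ and $D_1 a_2 = a_2/r$.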

Ricci and his assistant Levi-Civita (who was later to be one of the great
specialists of fluid mechanics) had explained this and many other things in
the more subtle case of arbitrary $ds^2$, in particular in a long memoir in French
published in the famous German journal (Méthodes du calcul différentiel absolu,
Math. Annalen, 1901), an excellent example of scientific internationalism
at a time when saber rattlers, as they were called, held sway everywhere.
Albert Einstein had to assimilate everything to build his general relativity
about a decade later; not being as great a virtuoso of tensor calculus as the
Italians, he encountered some problems with them.²² The subsequent development
of Riemannian geometry has made it possible to get rid of a large
part of these coordinate calculations from the theory (though this has not
made it easier to understand...), but when he wanted to deduce that light
rays were deviated in the neighbourhood of an intense gravitational field,
Einstein was obliged to make everything explicit to obtain numerical results,
verifiable by experiments, a task that mathematicians are generally exempted
from. Finally, as in many other fields, the modern theory arrived after the
“profusion of indices” which, for want of better, can still serve as exercises
for the reader and from which Dieudonné himself did not fully escape at the
end of vol. 3 of his Eléments d’analyse.


22 For a history of the subject, see Karin Reich, Die Entwicklung des Tensorkalküls.
Vom absoluten Differentialkalkül zur Relativitätstheorie (Birkhäuser, 1989); a
chapter on the subject can also be found in T. Hawkins, Emergence of the Theory
of Lie Groups. An Essay in the History of Mathematics 1869-1926 (Springer-Verlag,
2000), which covers a far wider field. I forego citing modern presentations
of Riemannian geometry; there are too many and they do not throw any light
on the history of the subject.
§ 2. Differential Forms of Degree 1


4 — Differential Forms of Degree 1 


As seen in Chapter VIII, finding a primitive $F$ of a holomorphic function $f$
on an open set $U \subset \mathbb{C} = \mathbb{R}^2$ amounts to constructing a $C^1$ function $F$ in $U$ such
that $dF = f(z)dz = f(z)dx + if(z)dy$, in other words such that

$$D_1 F = f, \qquad D_2 F = if.$$

A somewhat more general problem is to write as $dF$ an expression like

$$\omega = p(x,y)dx + q(x,y)dy, \tag{4.1}$$

i.e. a differential form of degree 1 on $U$, where $p$ and $q$, its coefficients, are
given functions; the coefficients are always assumed to be continuous and, in
order to avoid serious complications, it is better to assume they are of class
$C^1$ at least on $U$.

The aim is, therefore, to find $C^1$ functions $F$ on $U$ satisfying

$$D_1 F = p, \qquad D_2 F = q. \tag{4.2}$$

If such a primitive $F$ of $\omega$ exists in $U$, $\omega$ is said to be an exact differential;
if $U$ is connected, $F$ is unique up to an additive constant (Chap. III, n° 21,
consequence of the mean value theorem for several variables).

If $p$ and $q$ are $C^1$, in which case $F$ is $C^2$, the relation $D_1 D_2 F = D_2 D_1 F$
requires

$$D_1 q = D_2 p; \tag{4.3}$$

if this necessary but not always sufficient condition is satisfied, $p\,dx + q\,dy$ is
said to be a closed differential. This terminology is inherited from algebraic
topology and “Stokes” type integration formulas. In the holomorphic case
($p = f$, $q = if$), (3) is just Cauchy’s holomorphy condition.
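
The holomorphic case can be probed numerically. In the sketch below, which is only an illustration under my own choice of test functions, the quantity $D_1 q - D_2 p$ is approximated by central differences for $p = f$, $q = if$. It essentially vanishes for the holomorphic function $f(z) = z^2$ and does not for the non-holomorphic function $f(z) = \bar z$. The name `closedness_defect` is hypothetical.

```python
def closedness_defect(f, x, y, eps=1e-6):
    # Approximates D_1 q - D_2 p for the form p dx + q dy with p = f, q = i f.
    def p(x, y):
        return f(complex(x, y))
    def q(x, y):
        return 1j * f(complex(x, y))
    d1_q = (q(x + eps, y) - q(x - eps, y)) / (2 * eps)
    d2_p = (p(x, y + eps) - p(x, y - eps)) / (2 * eps)
    return d1_q - d2_p

# Sample point and test functions are arbitrary choices.
holomorphic_defect = abs(closedness_defect(lambda z: z * z, 0.3, -1.2))
conjugate_defect = abs(closedness_defect(lambda z: z.conjugate(), 0.3, -1.2))
print(holomorphic_defect, conjugate_defect)
```

For $f(z) = \bar z$ one finds $D_1 q = i$ and $D_2 p = -i$, so that the form $\bar z\,dz$ is not closed.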

Another case: it was observed in Chap. VII, n° 24 that if $H$ is a real
harmonic function on $U$, finding a holomorphic function with real part $H$
amounts to finding a primitive of the holomorphic function $D_1 H - iD_2 H$,
hence of the differential form

$$(D_1 H - iD_2 H)(dx + i\,dy) = dH - i(D_2 H\,dx - D_1 H\,dy), \tag{4.4}$$

hence of the differential form $D_2 H\,dx - D_1 H\,dy$; the latter is closed since, by
assumption, $\Delta H = 0$, where $\Delta = D_1 D_1 + D_2 D_2$ is the Laplace operator
$d^2/dx^2 + d^2/dy^2$.

There are similar problems in arbitrary dimension $n$. A differential form
(of degree 1) on an open subset $U$ of an $n$-dimensional space $E$ with chosen
basis $(a_i)$ is for the moment a purely symbolic expression

$$\omega = p_1(x)dx^1 + \ldots + p_n(x)dx^n = p_i\,dx^i, \tag{4.5}$$

where the $p_i$ are given possibly complex or even vector-valued functions of
class $C^1$ at least on $U$, and where the $x^i$ are the coordinates of $x \in U$ with
respect to the given basis of $E$; this definition has no “absolute” meaning if
the changes undergone by the $p_i$ under a change of basis are not made clear,
but a direct definition of differential forms can be given, which gets rid of
this problem.

Indeed, if there is a function $F$ of class at least $C^1$ on $U$, its differential is
a function $dF(x;h)$ of a point $x \in U$ and of a vector $h \in E$. For fixed $x$,
this function is linear in $h$. The natural generalization is therefore to consider
functions $\omega(x;h)$ subject to the same conditions, in other words tensor fields
of type (0,1) or, what amounts to the same since a linear functional

$$\omega(x) : h \mapsto \omega(x;h)$$

on $E$, i.e. an element of $E^*$, is associated to each $x \in U$, maps from $U$ to $E^*$
(or to $E^*_{\mathbb{C}}$ since complex valued forms are also considered). Conversely, once
a basis for $E$ is chosen, thanks to this definition, any map $\omega : U \to E^*$ can
be written in the form (5). To see this, first write that

$$\omega(x;h) = \omega(x; h^i a_i) = h^i\omega(x;a_i) = p_i(x)h^i \quad\text{where } p_i(x) = \omega(x;a_i). \tag{4.6}$$

The expression $p_i(x)dx^i$ is then justified by precisely the same reasons as
those given at the end of section (i) of n° 2: write that $h^i = dx^i(x;h)$.
Conversely, it suffices to associate the function

$$\omega(x;h) = p_i(x)h^i \quad (x \in U,\ h \in E) \tag{4.7}$$

to (5) to reduce to the new definition.

As in n° 3, (ii), $\omega$ could be written in an arbitrary local chart $(U,\varphi)$. The
latter defines a basis $(a_i(\xi))$ of $E$ at each $x \in U$, where $\xi = \varphi(x)$; as seen in
(3.4), any vector $h \in E$ can then be written

$$h = h^i(\xi)a_i(\xi) = a_i(\xi)d\xi^i(x;h)$$

with coordinates depending both on the chart and the point $x$ considered.
Hence

$$\omega(x;h) = p_i(\xi)h^i(\xi) = p_i(\xi)d\xi^i(x;h),$$

where the $p_i(\xi) = \omega[x;a_i(\xi)]$ are the components of the tensor field $\omega$ in
the chart considered; so, following Leibniz, the expression for $\omega$ in the chart
$(U,\varphi)$ is

$$\omega(x;dx) = p_i\,d\xi^i. \tag{4.8}$$

Under chart change, the coefficients $p_i(\xi)$ are transformed like those of a
tensor of type (0,1):

$$p_\alpha(\eta) = \rho^i_\alpha(\eta)\,p_i(\xi), \tag{4.9}$$

where the derivative $\rho^i_\alpha(\eta) = d\xi^i/d\eta^\alpha$ is calculated at the point $\eta = \psi(x)$. We
mostly argue in terms of standard Cartesian coordinates, but using arbitrary
local charts cannot be avoided when extending the theory to differentiable
manifolds.
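
A small worked check of (4.9) may be useful. Take for the first chart the Cartesian coordinates $\xi = (x,y)$ of the plane and for the second the polar coordinates $\eta = (r,\theta)$, and let $\omega = dF$ with the sample function $F(x,y) = x^2 y$ (my own choice, as are all the function names below). The sketch compares the components obtained from (4.9) with the partial derivatives of $F$ expressed directly in polar coordinates.

```python
import math

def p_cartesian(x, y):
    # Components of dF in Cartesian coordinates for F(x, y) = x^2 y.
    return (2.0 * x * y, x * x)

def rho(r, theta):
    # rho[i][alpha] = d xi^i / d eta^alpha, xi = (x, y), eta = (r, theta).
    return [[math.cos(theta), -r * math.sin(theta)],
            [math.sin(theta),  r * math.cos(theta)]]

def p_polar_by_transformation(r, theta):
    # Right hand side of (4.9): p_alpha = rho^i_alpha p_i.
    x, y = r * math.cos(theta), r * math.sin(theta)
    p = p_cartesian(x, y)
    m = rho(r, theta)
    return tuple(sum(m[i][alpha] * p[i] for i in range(2))
                 for alpha in range(2))

def p_polar_direct(r, theta):
    # Partial derivatives of r^3 cos^2(theta) sin(theta) in r and theta.
    c, s = math.cos(theta), math.sin(theta)
    return (3.0 * r * r * c * c * s, r ** 3 * (c ** 3 - 2.0 * c * s * s))

r, theta = 1.3, 0.4                       # arbitrary sample point
transformed = p_polar_by_transformation(r, theta)
direct = p_polar_direct(r, theta)
mismatch = max(abs(a - b) for a, b in zip(transformed, direct))
print(mismatch)                           # rounding error only
```

The agreement is just the chain rule, which is all that (4.9) says when $\omega$ is the differential of a function.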

Finally note the similarity, and not the identity, between vector fields
and differential forms: a vector field on an open set $U \subset E$ associates to
each $x \in U$ a vector of $E$ (i.e. is a map from $U$ to $E$), whereas a differential
form associates to each $x \in U$ a covector of $E$ (i.e. is a map from $U$ to the
possibly complex dual $E^*$ of $E$). This is the difference between tensor fields
of type (1,0) and (0,1).


5 — Local Primitives 


(i) Existence: calculations in terms of coordinates. Having said that, and
considering a Cartesian space $E$ equipped with a basis $(a_i)$, let us return to
the search for a primitive $F$ of a differential form $\omega = p_i(x)dx^i$ of class $C^1$.
We will consider an open connected subset $G$, i.e. a domain, as the general
case can be reduced to it in an obvious way. In all that follows, $x^i$ will denote
the coordinates of a point or a vector with respect to the chosen basis and we
set $D_i = d/dx^i$ as usual.
Relations

$$D_i F = p_i \quad\text{for } 1 \le i \le n \tag{5.1}$$

obviously require

$$D_j p_i - D_i p_j = 0 \tag{5.2}$$

for all $i$ and $j$, which in reality gives $n(n-1)/2$ independent relations, for
example those for which $i > j$; if these necessary conditions hold, $\omega$ is said
to be closed.

Exercise. Show that this definition is independent of the choice of the
basis $(a_i)$. Does relation (2) hold in an arbitrary local chart?

In the case of holomorphic functions, we know there are always local prim- 
itives; this remains true in the general case, a result already known to Euler 
for two variable forms. It can be proved as in Chapter VIII, n° 2, (i), for- 
mula (2), though not quite so easily. 

First, let $F$ be a $C^1$ function defined at least on the ball $B : \|x\| < R$
centered at $0$ in $E$ [to argue in the neighbourhood of an arbitrary point $a$,
consider $F(a + x)$]. For given $x \in B$, the function $t \mapsto F(tx)$ is defined on
an open subset of $\mathbb{R}$ containing $I = [0,1]$. Setting $D = d/dt$, the multivariate
chain rule shows that


(5.3) $D\{F(tx)\} = D_iF(tx).x^i$.


The FT then shows that


(5.4) $F(x) = F(0) + \int_0^1 D_iF(tx).x^i\,dt$.


Hence, if a closed differential $\omega = p_i(x)dx^i$ with coefficients of class $C^1$
(or even $C^0$, but the following arguments fall apart in this case) admits a
primitive $F$ on the ball $B$ where it is defined, then necessarily, for all $x \in B$
and up to an additive constant,


(5.5) $F(x) = \int_0^1 p_i(tx)x^i\,dt = \int_0^1 \omega(tx;x)\,dt$.


It remains to check that $D_jF = p_j$ for all $j$. For this, the right
hand side of (5) needs to be differentiated with respect to $x^j$. Now, the $p_i$
being $C^1$, the function of $(t,x) \in I \times G$ being integrated has continuous first
order partial derivatives with respect to the $x^j$, and so


(5.6) $D_jF(x) = \int_0^1 D_j\{p_i(tx)x^i\}\,dt$
        $= \int_0^1 D_jp_i(tx).tx^i\,dt + \int_0^1 p_i(tx)\delta^i_j\,dt$
        $= \int_0^1 D_jp_i(tx).tx^i\,dt + \int_0^1 p_j(tx)\,dt$,


where $\delta^i_j = D_jx^i$, the Kronecker delta already mentioned in (1.8), is equal
to 1 or 0 according to whether $i$ and $j$ are equal or not. But, by (2) and by
(2.20) applied to $p_j$,


(5.7) $D_jp_i(tx).tx^i = D_ip_j(tx).tx^i = tD\{p_j(tx)\}$.


As $D = d/dt$, integration by parts is possible, and so


$D_jF(x) = tp_j(tx)\big|_0^1 - \int_0^1 p_j(tx)\,dt + \int_0^1 p_j(tx)\,dt = p_j(x)$,


which solves the problem.
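
Formula (5.5) lends itself to a quick numerical sanity check. The following is only a minimal sketch under stated assumptions: the closed form $(2xy + 1)dx + (x^2 + 3y^2)dy$, the function names and the quadrature rule are illustrative choices that do not come from the text. It computes $F$ by integrating $\omega(tx;x)$ over $[0,1]$ and compares the result with a primitive found by hand.

```python
# Numerical check of formula (5.5): F(x) = integral over [0, 1] of p_i(tx) x^i dt.
# Illustrative closed form (an assumption, not from the text):
#   omega = (2xy + 1) dx + (x^2 + 3y^2) dy, closed since D_2 p_1 = D_1 p_2 = 2x.

def p(x, y):
    """Coefficients (p_1, p_2) of the illustrative closed form."""
    return (2.0 * x * y + 1.0, x * x + 3.0 * y * y)


def simpson(g, a, b, n=200):
    """Composite Simpson rule on [a, b] with an even number n of subintervals."""
    step = (b - a) / n
    total = g(a) + g(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * g(a + k * step)
    return total * step / 3.0


def primitive(x, y):
    """F(x, y) given by (5.5): integrate omega(t*x; x) for t in [0, 1]."""
    def integrand(t):
        p1, p2 = p(t * x, t * y)
        return p1 * x + p2 * y
    return simpson(integrand, 0.0, 1.0)


def exact(x, y):
    """Primitive found by hand, normalised by F(0) = 0."""
    return x * x * y + x + y ** 3


def partial(F, x, y, j, eps=1e-5):
    """Central-difference approximation of D_j F at (x, y), with j in {0, 1}."""
    dx, dy = (eps, 0.0) if j == 0 else (0.0, eps)
    return (F(x + dx, y + dy) - F(x - dx, y - dy)) / (2.0 * eps)
```

Calling `primitive(1.5, -0.7)` and `exact(1.5, -0.7)` should give the same value up to rounding, and `partial(primitive, x, y, j)` should approximate the coefficient `p(x, y)[j]`, which is the content of the relation $D_jF = p_j$ just proved.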

This calculation applies to any star-shaped open set $G$, i.e. in which there
is some $a \in G$ such that, for all $x \in G$, the line segment $[a,x]$ is contained in
$G$. In a star-shaped open subset²³ of a Cartesian space, all closed differential
forms of class $C^1$ have a primitive, i.e. are exact. In particular, in a star
domain $G \subset \mathbb{C}$, any real harmonic function $H$ is the real part of a holomorphic
function on $G$ (a result already shown in Chapter VIII), namely


(5.8) $f(z) = H(x,y) - i\int_0^1 [D_2H(tx,ty)x - D_1H(tx,ty)y]\,dt$


assuming $G$ is star-shaped with respect to the point $a = 0$: it suffices to apply
(4) to the form (4.4).


23 Or a simply connected one, as will be shown later.

Exercise. Check directly that function (8) is holomorphic by giving an 
explicit proof of the general theorem. 


(ii) Existence of local primitives: intrinsic formulas. We show how to
obtain the previous result without using coordinates. This method is far less
stupid than the first one, though it is only a camouflage, but it has the
advantage of holding for differential forms on Banach spaces.²⁴ Whether it is
easier to understand than the former one is left to the reader to judge.

Let us start again with a closed differential form $\omega$ of class $C^1$ on an open
set $G$ star-shaped with respect to the point 0 and suppose that it admits a
primitive $F$ on $G$ with $F(0) = 0$. For all $x \in G$, the derivative of $t \mapsto F(tx)$
at the point $t$ is the derivative of $s \mapsto F[(t+s)x] = F(tx + sx)$ at $s = 0$; it
is, therefore, $dF(tx;x) = \omega(tx;x)$. Then, as above, the FT shows that


(5.9) $F(x) = \int_0^1 \omega(tx;x)\,dt$.


So it all amounts to verifying that, if $\omega$ is closed, the formula does indeed
define a function such that $dF = \omega$.

Since $dF(x;h)$ is the derivative of $s \mapsto F(x + sh)$ at $s = 0$, $dF(x;h)$ is
obtained by taking $s = 0$ in the following calculation:


(5.10) $dF(x;h) = \frac{d}{ds}\int_0^1 \omega(tx + tsh;x + sh)\,dt$
        $= \frac{d}{ds}\int_0^1 \omega(tx + tsh;x)\,dt + \frac{d}{ds}\,s\int_0^1 \omega(tx + tsh;h)\,dt$,


where linearity of $h \mapsto \omega(y;h)$ for given $y$ has been used. Differentiating
under the $\int$ sign raises no difficulty. To conveniently formulate the result,
the covariant derivative [see (3.11)]


(5.11) $\omega'(x;h,k) = \frac{d}{ds}\omega(x + sh;k)\big|_{s=0} = D_ip_j(x)h^ik^j$


proves useful if $\omega = p_i(x)dx^i$. Having done that, let us return to the last two
integrals of (10) and differentiate under the $\int$ sign; taking (11) into account
and setting $s = 0$ in the result,²⁵ we get


(5.12) $\int_0^1 \omega'(tx;h,x)\,t\,dt + \int_0^1 \omega(tx;h)\,dt$.


24 See Henri Cartan, Calcul différentiel (Hermann).
25 The derivative of a function of the form $sf(s)$ at $s = 0$ is equal to $f(0)$.

Integrating by parts the second term, it becomes


$t\omega(tx;h)\big|_0^1 - \int_0^1 t\,\frac{d}{dt}\omega(tx;h)\,dt = \omega(x;h) - \int_0^1 t\,\frac{d}{dt}\omega(tx;h)\,dt$.


But


$\frac{d}{dt}\omega(tx;h) = \frac{d}{ds}\omega(tx + sx;h)\big|_{s=0} = \omega'(tx;x,h)$,


so that finally,


(5.13) $dF(x;h) = \omega(x;h) + \int_0^1 [\omega'(tx;h,x) - \omega'(tx;x,h)]\,t\,dt$.
0 


This involves what is called the exterior derivative


(5.14) $d\omega(x;h,k) = \omega'(x;h,k) - \omega'(x;k,h)$


of the form $\omega$; for given $x$, it is an alternating or antisymmetric bilinear form
of the vectors $h, k$. Without going further into this topic, which we will return
to later, we find that the formula


(5.13') $dF(x;h) = \omega(x;h) - \int_0^1 d\omega(tx;x,h)\,t\,dt$


holds without any assumptions on $\omega$. On the other hand, setting


$\omega(x;h) = p_i(x)h^i$,


it is easy to first calculate


$\omega'(x;h,k) = dp_j(x;h)k^j = D_ip_j(x)h^ik^j$,
$\omega'(x;k,h) = dp_i(x;k)h^i = D_jp_i(x)h^ik^j$,


and then


(5.15) $d\omega(x;h,k) = (D_ip_j - D_jp_i)h^ik^j$.


This formula shows that


(5.16) $\omega$ is closed $\iff$ $d\omega = 0$.


Relation (13') then reduces to


$dF(x;h) = \omega(x;h)$,


ending the proof.
Relation


(5.14') $\omega'(x;h,k) = \omega'(x;k,h)$


is still well-defined in Banach spaces, where calculations in terms of coordinates
are no longer possible; it then serves as a definition for closed forms.

Exercise. Suppose $\omega = dF$; calculate $\omega'(x;h,k)$ directly, without coordinates,
and show that (16) reduces to the formula $d^2/ds\,dt = d^2/dt\,ds$.


6 — Integration Along a Path. Inverse Images 


(i) Integrals of a differential form. Everything that has been said in Chapter 
VIII about integrals of holomorphic functions generalizes, with some small 
adjustments, to differential forms. So we will remain brief. 

Consider a differential form $\omega = p_i(x)dx^i$ of degree 1 on a domain $G$
in a finite-dimensional vector space $E$ and assume it has a primitive $F$ on
$G$. Connect $a \in G$ to $b \in G$ by a path $\mu : [0,1] = I \to G$ of class $C^1$
or, more generally, an admissible path or path of class $C^{1/2}$, i.e. such that the
coordinates $\mu^i(t)$ of $\mu(t)$ are primitives of regulated functions. Writing $D$
for the differentiation operator with respect to $t$, the multivariate chain rule
(2.20) shows that outside a countable set of values of $t$ where $\mu(t)$ is not
differentiable (i.e. has different right and left derivatives),


(6.1) $D\{F[\mu(t)]\} = dF[\mu(t);\mu'(t)] = D_iF[\mu(t)]\,D\mu^i(t) = p_i[\mu(t)]\,D\mu^i(t)$,


where $p_i = D_iF$. The result is a regulated function of $t$ since the $p_i[\mu(t)]$ are
continuous and the $D\mu^i(t)$ are regulated; hence (FT)


(6.2) $F(b) - F(a) = \int_0^1 p_i[\mu(t)]\,D\mu^i(t)\,dt = \int_0^1 \omega[\mu(t);\mu'(t)]\,dt$.


The right hand side of (2), which is well-defined for any form $\omega$, is by definition
the integral of $\omega$ along $\mu$, denoted by any one of the following


(6.3) $\int_\mu \omega = \int_\mu \omega(x;dx) = \int_0^1 \omega[\mu(t);\mu'(t)]\,dt = \int_0^1 p_i[\mu(t)]\,d\mu^i(t)$


with, finally, Stieltjes integrals as in Chapter VIII, n° 2, (iii).

As before, these calculations show that a closed form of class $C^1$ has a
primitive on $G$ if and only if its integral along a path depends only on the
endpoints of the path or, equivalently, its integral along any closed path is
zero. In fact, the result can be generalized to all forms of class $C^0$ in the
following manner.

Assume that the stated condition holds; then for any $a, b \in G$, we can
write


$\int_a^b \omega$


for the integral of $\omega$ along any path connecting $a$ to $b$ in $G$; there is no
ambiguity. The traditional formula (Chap. V, §3, n° 12)


$\int_a^b = \int_a^c + \int_c^b$


continues to hold since it is possible to go from $a$ to $b$ through $c$. Having
said this, set


$F(x) = \int_a^x \omega$,


where $a \in G$ is chosen once for all. Then, for given $x$ and sufficiently small $h$,


$F(x+h) - F(x) = \int_x^{x+h} \omega = \int_0^1 \omega(x + th;h)\,dt = h^i\int_0^1 p_i(x + th)\,dt$


follows. This can be seen by integrating along the line segment $[x, x+h]$ in a
disc centered at $x$ contained in $G$. But as $\omega$ is of class $C^0$, for given $x$ and for
any $r > 0$, there is $r' > 0$ such that $\|h\| < r'$ implies
$|\omega(x+th;h) - \omega(x;h)| \le r\|h\|$ for all $t \in [0,1]$, and so


$F(x+h) - F(x) = \omega(x;h) + o(h)$


and $dF(x;h) = \omega(x;h)$, qed. In conclusion:


Theorem 2. For any differential form $\omega$ of degree 1 and class $C^0$ on a
domain $G$, the following conditions are equivalent:


(i) $\omega$ is an exact differential;

(ii) the integral of $\omega$ along any admissible path in $G$ depends only on the
endpoints of the path;

(iii) the integral of $\omega$ along any closed admissible path in $G$ is zero.
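
Conditions (ii) and (iii) can be illustrated numerically. The sketch below is not part of the text's argument: the two forms, the two paths and the midpoint quadrature rule are assumptions chosen for illustration. It integrates an exact form and a non-closed form along a segment and along an arc of parabola joining the same two points.

```python
def line_integral(omega, path, velocity, n=2000):
    """Midpoint-rule approximation of the integral of omega[mu(t); mu'(t)] over [0, 1]."""
    step = 1.0 / n
    total = 0.0
    for k in range(n):
        t = (k + 0.5) * step
        total += omega(path(t), velocity(t)) * step
    return total


# omega(x; h) for two illustrative forms on R^2 (assumptions, not from the text).
def exact_form(x, h):
    """y dx + x dy, the differential of F(x, y) = xy."""
    return x[1] * h[0] + x[0] * h[1]


def non_closed_form(x, h):
    """-y dx + x dy, for which D_1 p_2 - D_2 p_1 = 2 is not zero."""
    return -x[1] * h[0] + x[0] * h[1]


# Two paths from (0, 0) to (1, 1), each given with its derivative.
def segment(t):
    return (t, t)


def segment_dot(t):
    return (1.0, 1.0)


def parabola(t):
    return (t, t * t)


def parabola_dot(t):
    return (1.0, 2.0 * t)
```

For the exact form, both paths should give approximately $F(1,1) - F(0,0) = 1$. For the non-closed form, the segment gives approximately 0 and the parabola approximately 1/3, so the integral is not determined by the endpoints alone.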


The comments of Chapter VIII, n° 2, (ii) on changes of parametrization 
for a path, “opposite” paths, additivity of the integral when two paths are 
adjoined, etc. can be made to apply without any changes to the general case. 


(ii) Inverse image of a differential form. It is more useful to observe that
formula (3) defining a curvilinear integral suggests an operation on differential
forms generalizing the composition of maps and which will be later
generalized to forms of arbitrary degree. For this, consider the open subsets
$U$ and $V$ in the Cartesian spaces $E$ and $F$ and a map $f : U \to V$. It associates
to each function $q$ on $V$ a composite function $p = q \circ f$ on $U$, given
by


$p(x) = q[f(x)]$.


If $q$ and $f$ are of class $C^1$, in which case the same holds for $p$, the differentials
$\omega = dp$ and $\varpi = dq$ of $p$ and $q$ are connected by the multivariate chain rule,
which becomes here


(6.4) $\omega(x;h) = \varpi[f(x);f'(x)h]$.


Now, the right hand side of (4) is well-defined for any form $\varpi$ on $V$ and,
for given $x$, is a linear map of $f'(x)h$ and hence of $h$. So by relation (4), a
differential form $\omega$ on $U$ can be deduced from $\varpi$ and $f$, the inverse image²⁶
of $\varpi$ under $f$, which could also be called the composition of $\varpi$ and $f$; the
nature of this operation is imposed by the necessity of making it well-defined.
Some authors denote it by $f^*(\varpi)$ or ${}^tf(\varpi)$; this barbarian notation presents
occasional advantages, but no one has ever denoted the composite $q \circ f$ of
two functions by $f^*q$ or ${}^tf(q)$. So we will denote the inverse image by the
notation


(6.5) $\varpi \circ f : (x;h) \mapsto \varpi[f(x);f'(x)h]$


which, in Leibniz style, can be written


(6.6) $\varpi \circ f(x;dx) = \varpi[f(x);df(x)] = \varpi[f(x);f'(x)dx]$


in a similar way as the differentiation theorem of composite functions; conversely,
the latter can now be written as


(6.7) $dq \circ f = d(q \circ f)$.


A curvilinear integral can then be interpreted in the following manner.
Thanks to the path $\mu : I \to E$, $\omega$ is transformed into a differential form
$\omega \circ \mu$ on $I$, given more explicitly by


$\omega \circ \mu(t;dt) = \omega[\mu(t);\mu'(t)]\,dt$.


This is precisely the expression integrated over $I$ in order to integrate $\omega$
along $\mu$. In dimension one, any differential form can be written $p(t)dt$ with
a function $p(t)$, and the integral of $p$ over $I$ is just the integral of
the form $p(t)dt$ along the rather commonplace path $t \mapsto t$. Hence, in Leibniz
style, we get the formula


(6.8) $\int_\mu \omega(x;dx) = \int_I \omega \circ \mu(t;dt)$.


Now consider three open subsets $U$, $V$ and $W$ in the Cartesian spaces
$E$, $F$ and $G$ and two maps $f : U \to V$ and $g : V \to W$. So there is a
composite map $g \circ f : U \to W$. Given a form $\omega$ on $W$, one can either define
successively a form $\omega \circ g$ on $V$ and then a form $(\omega \circ g) \circ f$ on $U$, or define
directly a form $\omega \circ (g \circ f)$ on $U$. Like in set theory, where


$(h \circ g) \circ f = h \circ (g \circ f)$


trivially holds, we have here an associativity formula


(6.9) $(\omega \circ g) \circ f = \omega \circ (g \circ f)$.


26 The “direct” image would consist in deducing from $f$ and from a form $\omega$ on $U$
a form $\varpi$ on $V$. This is impossible to do if $f$ is not a diffeomorphism.

Indeed, start with some $x \in U$. The right hand side can be obtained at this
point from $\omega(z;dz)$ by replacing $z$ by $g \circ f(x) = g[f(x)]$ and $dz$ by


$d(g \circ f)(x;dx) = dg[f(x);f'(x)dx]$


in it. Therefore, the right hand side can be calculated by the following
operations: first replace $z$ by $g(y)$ and $dz$ by $dg(y;dy)$, which replaces $\omega$
by $\omega \circ g(y;dy)$, then replace $y$ by $f(x)$ and $dy$ by $f'(x)dx$, which replaces
$\omega \circ g(y;dy)$ by $(\omega \circ g) \circ f(x;dx)$, giving (9).

In particular, we may suppose that²⁷ $U = I$, so that $f$ is a path $\mu$ in $V$
and $g \circ \mu$ a path in $W$. Integrating, (8) and (9) immediately show that


(6.10) $\int_{g \circ \mu} \omega = \int_\mu \omega \circ g$.


If $V$ is also an interval in $\mathbb{R}$, we recover the fact that, within reasonable
conditions, the value of a curvilinear integral is independent of the chosen
parametrization.
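
The independence with respect to the parametrization can be observed on a small example. This is a hedged sketch only: the form $-y\,dx + x\,dy$, the path $t \mapsto (t, t^2)$ and the change of parameter $\varphi(u) = u^2$ are hypothetical choices made for the illustration, and the integrals are approximated by the midpoint rule.

```python
def line_integral(omega, path, velocity, n=4000):
    """Midpoint-rule approximation of the integral of omega[mu(t); mu'(t)] over [0, 1]."""
    step = 1.0 / n
    total = 0.0
    for k in range(n):
        t = (k + 0.5) * step
        total += omega(path(t), velocity(t)) * step
    return total


def omega(x, h):
    """Illustrative form -y dx + x dy (deliberately not closed)."""
    return -x[1] * h[0] + x[0] * h[1]


def mu(t):
    """Path t -> (t, t^2) from (0, 0) to (1, 1)."""
    return (t, t * t)


def mu_dot(t):
    return (1.0, 2.0 * t)


def phi(u):
    """Change of parameter, an increasing C^1 bijection of [0, 1] onto itself."""
    return u * u


def mu_phi(u):
    """Reparametrized path mu o phi."""
    return mu(phi(u))


def mu_phi_dot(u):
    """Chain rule: (mu o phi)'(u) = mu'(phi(u)) * phi'(u), with phi'(u) = 2u."""
    d = mu_dot(phi(u))
    return (d[0] * 2.0 * u, d[1] * 2.0 * u)
```

Both `line_integral(omega, mu, mu_dot)` and `line_integral(omega, mu_phi, mu_phi_dot)` should be close to 1/3, even though the form is not closed: the invariance concerns the parametrization of one and the same curve, not the choice of the curve.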


7 — Effect of a Homotopy on an Integral 


(i) Differentiation with respect to a path. Like in the very particular case of 
holomorphic functions discussed in the previous chapter, the fundamental 
property of the integral of a closed (but not necessarily exact) differential 
form consists in not changing when deforming the path of integration with- 
out changing its endpoints, i.e. by a fixed-endpoint homotopy. Here too, the 
arguments of Chapter VIII provide the method and the results. 

Before extending what has been proved in Chapter VIII to differential
forms, the differentiation formula with respect to the integration path of
Chapter VIII, n° 3, (ii) must first be generalized. In other words, consider an
admissible path $\mu : I = [0,1] \to G$ in the open subset $G$ of the Cartesian
space $E$, and an admissible path $\nu : I \to E$ in $E$ and differentiate the
expression


(7.1) $F(\mu + s\nu) = \int_{\mu + s\nu} \omega = \int \omega[\mu(t) + s\nu(t);\mu'(t) + s\nu'(t)]\,dt$


with respect to $s$, where $s$ is a parameter varying in a sufficiently small interval
around 0. As shorthand for (1), write


(7.1') $F(\mu + s\nu) = \int \omega(\mu + s\nu;\mu')\,dt + s\int \omega(\mu + s\nu;\nu')\,dt$
        $= \int \omega(\mu + s\nu;d\mu) + s\int \omega(\mu + s\nu;d\nu)$


or in coordinates,


27 The fact that $I$ is not open is not important.

(7.1'') $F(\mu + s\nu) = \int p_i(\mu + s\nu)\,d\mu^i + s\int p_i(\mu + s\nu)\,d\nu^i$


where the components $d\mu^i(t) = D\mu^i(t)dt$ and $d\nu^i(t) = D\nu^i(t)dt$ of the “vector”
measures $d\mu$ and $d\nu$ are Radon measures. As in Chapter VIII,
n° 3, (ii), to justify differentiation under the $\int$ sign, it suffices to check that


(a) the functions


(7.2') $(s,t) \mapsto p_i[\mu(t) + s\nu(t)]$


or, equivalently,


(7.2'') $(s,t) \mapsto \omega[\mu(t) + s\nu(t);h]$


are continuous for given $h$,
(b) their derivatives with respect to $s$ exist and are continuous functions of
$(s,t)$.

The first statement is obvious. Differentiability with respect to $s$ is equally
obvious since $C^1$ functions are being composed with a linear function of $s$. As
for the derivative of (2'') with respect to $s$, by definition (5.11) of a covariant
derivative, it is $\omega'[\mu(t) + s\nu(t);\nu(t),h]$.

Continuing to write $\mu, \nu, \ldots$ instead of $\mu(t), \nu(t), \ldots$,


(7.3) $\frac{d}{ds}F(\mu + s\nu) = \int \omega'(\mu + s\nu;\nu,\mu')\,dt + \int \omega(\mu + s\nu;\nu')\,dt + s\int \omega'(\mu + s\nu;\nu,\nu')\,dt$
        $= \int \omega'(\mu + s\nu;\nu,\mu' + s\nu')\,dt + \int \omega(\mu + s\nu;\nu')\,dt$


thus follows. To imitate the calculations of n° 5 or better those of Chapter
VIII, n° 3, we now need to apply the integration by parts formula to the last
integral. For given $x$, the expression $\omega(x;h)$ being a linear function of $h$, it is
identical to its differential with respect to $h$, so that


$\frac{d}{dt}\omega[x;f(t)] = \omega[x;f'(t)]$


for any vector-valued function $f(t)$. Given this result, the multivariate chain
rule and (5.11) show that


$\frac{d}{dt}\omega(\mu + s\nu;\nu) = \omega'(\mu + s\nu;\mu' + s\nu',\nu) + \omega(\mu + s\nu;\nu')$.


As a result, the last integral obtained in (3) can also be written


$\int \omega(\mu + s\nu;\nu')\,dt = \omega(\mu + s\nu;\nu)\big|_{t=0}^{t=1} - \int \omega'(\mu + s\nu;\mu' + s\nu',\nu)\,dt$.


Substituting this result in (3), it follows that


$\frac{d}{ds}F(\mu + s\nu) = \int_0^1 [\omega'(\mu + s\nu;\nu,\mu' + s\nu') - \omega'(\mu + s\nu;\mu' + s\nu',\nu)]\,dt + \omega(\mu + s\nu;\nu)\big|_{t=0}^{t=1}$.


Hence, using the exterior derivative


$d\omega(x;h,k) = \omega'(x;h,k) - \omega'(x;k,h)$


introduced in (5.14), we get


(7.4) $\frac{d}{ds}F(\mu + s\nu) = \omega(\mu + s\nu;\nu)\big|_{t=0}^{t=1} + \int_0^1 d\omega(\mu + s\nu;\nu,\mu' + s\nu')\,dt$.


If the form $\omega$ is closed, then $d\omega = 0$ as already seen in n° 5 and (4) reduces
to the relation


(7.4') $\frac{d}{ds}F(\mu + s\nu) = \omega[\mu(t) + s\nu(t);\nu(t)]\big|_{t=0}^{t=1}$


which generalizes formula (3.5) of Chapter VIII.


(ii) Effect of a homotopy on an integral. The same consequences as in the
previous chapter follow from (4'). If, in the open set $G$ where $\omega$ is defined,
two admissible paths $\mu_0$ and $\mu_1$ are sufficiently near for the path


$t \mapsto (1-s)\mu_0(t) + s\mu_1(t) = \mu_0(t) + s[\mu_1(t) - \mu_0(t)]$


to be in $G$ for all $s \in [0,1]$, then formula (4') applies for $\mu = \mu_0$ and
$\nu = \mu_1 - \mu_0$. The function $s \mapsto F(\mu + s\nu)$ is, therefore, constant in both usual
cases: $\mu_0$ and $\mu_1$ have the same endpoints, so that $\nu(0) = \nu(1) = 0$, or else
are both closed, so that the right hand side of (4') takes the same value at
$t = 0$ and $t = 1$. In the general case of two homotopic paths, as in Chapter VIII,
replace the given homotopy by a succession of linear homotopies. This gives the
result:


Theorem 3. Let $G$ be a domain in a Cartesian space $E$ and $\omega$ a closed
differential form of class $C^1$ on $G$. The integrals of $\omega$ along two admissible
paths $\mu_0$ and $\mu_1$ in $G$ are equal if one of the following conditions holds:


(a) There is a fixed-endpoint homotopy from $\mu_0$ to $\mu_1$ in $G$;
(b) $\mu_0$ and $\mu_1$ are closed and homotopic as closed paths in $G$.


From this, we can deduce that if all closed paths in $G$ are homotopic to
a constant path, then any closed form on $G$ has a primitive; a domain with
this property is said to be simply connected.
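
A standard example, not taken from the text, shows both the homotopy invariance of Theorem 3 and what can go wrong in a domain that is not simply connected. The sketch below assumes the form $(-y\,dx + x\,dy)/(x^2 + y^2)$, which is closed on the plane minus the origin, and integrates it numerically along three circles; the helper names and the midpoint rule are illustrative choices.

```python
import math


def angle_form(x, h):
    """omega(x; h) for (-y dx + x dy) / (x^2 + y^2), defined away from the origin."""
    r2 = x[0] ** 2 + x[1] ** 2
    return (-x[1] * h[0] + x[0] * h[1]) / r2


def circle(center, radius):
    """Closed path t -> center + radius * (cos 2 pi t, sin 2 pi t), with its derivative."""
    def mu(t):
        a = 2.0 * math.pi * t
        return (center[0] + radius * math.cos(a), center[1] + radius * math.sin(a))

    def mu_dot(t):
        a = 2.0 * math.pi * t
        return (-2.0 * math.pi * radius * math.sin(a),
                2.0 * math.pi * radius * math.cos(a))

    return mu, mu_dot


def loop_integral(omega, mu, mu_dot, n=4000):
    """Midpoint-rule approximation of the integral of omega along a closed path on [0, 1]."""
    step = 1.0 / n
    return sum(omega(mu((k + 0.5) * step), mu_dot((k + 0.5) * step)) * step
               for k in range(n))


# Two loops around the origin, homotopic as closed paths in the punctured plane.
around_origin = loop_integral(angle_form, *circle((0.0, 0.0), 1.0))
shifted_loop = loop_integral(angle_form, *circle((0.5, 0.0), 3.0))
# A loop contained in a disc avoiding the origin, hence homotopic to a constant path.
away_from_origin = loop_integral(angle_form, *circle((3.0, 0.0), 1.0))
```

The first two values should both be close to $2\pi$, in agreement with condition (b) of the theorem, while the third should be close to 0. Since the integral along a closed path is not always zero, this form has no primitive on the punctured plane, which is therefore not simply connected.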

Homotopy being an equivalence relation, it is then obvious that condition
(b) of the theorem always holds in such a domain. The same is true for
condition (a); to see this, start with the closed path $\nu_0 = \mu_0 - \mu_1$. Assuming
$\mu_0$ and $\mu_1$ to be parameterized by the interval $[0,1]$, it can be defined by the
formulas


$t \mapsto \mu_0(t)$ if $0 \le t \le 1$,   $t \mapsto \mu_1(2 - t)$ if $1 \le t \le 2$,


where the parameter $t$ now varies in $[0,2]$. Let


$\sigma : [0,1] \times [0,2] \to G$


Fig. 7.1.


be a deformation of $\nu_0$ during which the path remains closed at all times
and which, for $s = 1$, reduces to a point $c$. Adjoining to each intermediary path
$\nu_s$ the paths followed by the endpoints of $\mu_0$ (or $\mu_1$), $\nu_s$ can be replaced by the
difference between two paths with the same endpoints as $\mu_0$ and $\mu_1$; if we
insist on formulas, the first one (which must reduce to $\mu_0$ for $s = 0$) may be
defined by


(7.5') $u \mapsto \sigma(3su, 0)$ for $0 \le u \le 1/3$,
       $u \mapsto \sigma(s, 3u - 1)$ for $1/3 \le u \le 2/3$,
       $u \mapsto \sigma[3s(1-u), 1]$ for $2/3 \le u \le 1$;


when $u$ varies from 0 to 1/3, the corresponding point describes the trajectory
followed by the starting point common to both initial paths between “time”
$s = 0$ and time $s$. As $u$ varies over the interval $[1/3,2/3]$, the point with
parameter $u$ traces out the arc $0 \le t \le 1$ of $\nu_s(t)$ arising from the deformation
of $\mu_0$. Finally, in the interval $[2/3,1]$, we follow (in the reverse direction) the
trajectory of the endpoint common to $\mu_0$ and $\mu_1$. Clearly, the paths thus
defined are fixed-endpoint homotopic to $\mu_0$ for all $s$. Slightly modifying the
previous formulas in order to obtain, for $1/3 \le u \le 2/3$, the arc $1 \le t \le 2$ of
$\nu_s(t)$ arising from the deformation of $\mu_1$, the formulas


(7.5'') $u \mapsto \sigma(3su, 2)$ for $0 \le u \le 1/3$,
        $u \mapsto \sigma[s, 3(1-u)]$ for $1/3 \le u \le 2/3$,
        $u \mapsto \sigma[3s(1-u), 1]$ for $2/3 \le u \le 1$,


define a path fixed-endpoint homotopic to $\mu_1$. When $s = 1$, by assumption,
the path $\nu_s$ reduces to a point $c$. Up to parametrization, the paths (5') and
(5'') are then clearly identical. As they are (fixed-endpoint) homotopic to
$\mu_0$ and $\mu_1$ respectively, $\mu_0$ and $\mu_1$ are fixed-endpoint homotopic to each other.

Homotopy theory includes many small calculations of this type. After 
having done them once or twice, diagrams are substituted for them, to which 
the reader is in turn free to substitute his small calculations. 

Any star domain G with respect to a point a is clearly simply connected: 
homotheties with centre a and ratio s € [0,1] deform closed paths into a 
unique point. More generally, the same holds for any contractible domain G 
in the sense of the previous chapter. The main difference with the case of 
domains in C is that in R”, a simply connected domain is not necessarily 
contractible (counterexample: the open subset bounded by two concentric 
spheres). 


(iii) The Banach space $C^{1/2}(I;E)$. As in n° 3, (ii) of Chapter VIII, the
computation (3) of the derivative of $F(\mu + s\nu)$ can be interpreted in terms of
differential calculus in a Banach space.²⁸ The vector space $C^{1/2}(I;E)$ of $C^{1/2}$
maps $\mu : I \to E$, where $E$ is the ambient Cartesian space, can be equipped
with the norm


$\|\mu\| = \|\mu\|_I + \|\mu'\|_I$


inspired from distribution theory (Chap. V, § 10), or maybe it is the other way
round; $C^{1/2}(I;E)$ then becomes a complete space, i.e. a Banach space: the
proof is the same as in Chapter VIII. If $C^{1/2}(I;G)$ is the set of all admissible
paths in the given open subset $G$ of $E$, then $C^{1/2}(I;G)$ is clearly an open
subset of $C^{1/2}(I;E)$, i.e. for any $\mu \in C^{1/2}(I;G)$, any path $\nu \in C^{1/2}(I;E)$ with
sufficiently small $\|\mu - \nu\|$, or merely sufficiently small $\|\mu - \nu\|_I$, is also in
$C^{1/2}(I;G)$.

The function $F(\mu) = \int \omega[\mu(t);\mu'(t)]\,dt$, defined on the open set $C^{1/2}(I;G)$,
is everywhere continuous on it. Indeed, let $\mu, \nu$ be two elements of this open
set. To find an upper bound for $|F(\nu) - F(\mu)|$ for given $\mu$ and $\nu$ near $\mu$, it is
necessary to search for a uniform upper bound of


$\omega[\nu(t);\nu'(t)] - \omega[\mu(t);\mu'(t)] = p_i[\nu(t)]\,D\nu^i(t) - p_i[\mu(t)]\,D\mu^i(t)$


if $\omega = p_i(x)dx^i$. The problem is similar to that of the proof of the continuity of
a product: the uniform continuity of the $p_i(x)$ in a compact neighbourhood
of $\mu(I)$ shows that $|p_i[\nu(t)] - p_i[\mu(t)]|$ is everywhere $\le r$ if $\|\nu - \mu\|_I \le r'$,
the differences $|D\nu^i(t) - D\mu^i(t)|$ being bounded above by $\|\nu' - \mu'\|_I$. Usual
calculations then lead to the result.


28 To understand the rest of the book, reading this n° is not essential.


Having done this, note that the notion of a differential introduced in n° 2,
(i) for functions defined on an open subset of a Cartesian space generalizes
automatically to the case of a numerical-valued function $F$ defined on an open
subset of a Banach space $B$ or even with values in another Banach space $H$;
$F$ will be said to be differentiable at a point $x$ if there is a continuous linear
map²⁹ $u$ from $B$ to $H$ such that, for sufficiently small $h \in B$,


(7.6) $F(x + h) = F(x) + u(h) + o(h)$


with, as usual, $\lim \|o(h)\|/\|h\| = 0$ as $\|h\|$ tends to 0; then set
$dF(x;h) = u(h)$. With this definition and taking $s = 0$ in formula (4'), it is
tempting to write that


(7.7) $dF(\mu;\nu) = \omega[\mu(1);\nu(1)] - \omega[\mu(0);\nu(0)]$.


This expression is indeed linear in $\nu$ for given $\mu$. To justify it, it still remains
to be shown that


$F(\mu + \nu) = F(\mu) + dF(\mu;\nu) + o(\nu)$


as the norm of the path $\nu \in C^{1/2}(I;E)$ tends to 0. Now, (4') and the FT
show that, if $\omega$ is closed, then


(7.8) $F(\mu + \nu) - F(\mu) = \int_0^1 \{\omega[\mu(1) + s\nu(1);\nu(1)] - \omega[\mu(0) + s\nu(0);\nu(0)]\}\,ds$,


at least if $\|\nu\|$ is sufficiently small so that $\mu + s\nu \in C^{1/2}(I;G)$ for any $s \in I$.
But as $\omega(x;h)$ is a $C^1$ function of $(x,h)$ and is linear in $h$, there is an inequality
of the form


$|\omega(x + k;h) - \omega(x;h)| \le M\|k\|.\|h\|$


which holds for given $x$, for any $h$ and sufficiently small $\|k\|$. The error made by
replacing $s$ by 0 in the functions integrated in (8) is, up to a constant factor,
bounded above by $\|\nu(1)\|^2$ in the first case and $\|\nu(0)\|^2$ in the second and
hence, up to a constant, the total error made is at most $\|\nu\|^2$. But replacing
$s$ by 0 in the integrals (8), the right hand side is replaced by expression (7).


29 As already remarked, this is a superfluous assumption. On analysis in Banach
spaces, see Dieudonné, Eléments d’Analyse, vol. 1, VIII, or Serge Lang, Analysis
I, or Henri Cartan, Calcul différentiel, etc.

So finally,


(7.9) $F(\mu + \nu) - F(\mu) = \omega[\mu(1);\nu(1)] - \omega[\mu(0);\nu(0)] + O(\|\nu\|^2)$.


This result is more than sufficient to justify (7).
But (7) is based on formula (4'), i.e. supposes that $\omega$ is closed. If this is not
the case, it is necessary to return to the formula (4) in its entirety. Taking
$s = 0$ in it, it can be deduced that probably


(7.10) $dF(\mu;\nu) = \omega[\mu(1);\nu(1)] - \omega[\mu(0);\nu(0)] + \int d\omega[\mu(t);\nu(t),\mu'(t)]\,dt$.


Like in the case of the simpler formula (7), we leave it to the reader to check
this formula.


§ 3. Integration of Differential Forms


8 — Exterior Derivative of a Form of Degree 1 


(i) Physicists’ Vector Analysis. The problem of primitives occurs in physics
and in mechanics, but, apart from some theoreticians, physicists and mechanical
engineers, being quite conservative revolutionaries, prefer the terminology
inherited from the 19th century which they pass on from generation to
generation. For them, a differential form $Pdx + Qdy$ in two variables is a
vector field, namely the function whose value at $(x,y)$ is the vector of the
coordinate plane with components $P(x,y)$ and $Q(x,y)$; usually its origin is
taken to be the point $(x,y)$ rather than the origin of the coordinates. This
explains the use of the word “field”, in the same way as we talk about a wheat
field. The expression $D_1Q - D_2P$, a numerical or “scalar”-valued function, is
the rotational (or curl) of this vector field and for them, the primitive $F$,
when it exists, is the potential from which the given vector field is derived.
They also express the relation $dF = D_1F\,dx + D_2F\,dy$ by saying that the vector
field with coordinates $D_1F$ and $D_2F$ is the gradient of the function $F$.

There is above all a vocabulary for 3-dimensional physics. Note first that
the physical space of the Creator is not $\mathbb{R}^3$; it can be identified with it only
if an origin $O$, a unit length and, for what follows, a rectangular coordinate
system $Ox$, $Oy$ and $Oz$ are chosen. As a real valued linear functional $h \mapsto c_ih^i$
on a Cartesian space has exactly as many coordinates (its coefficients $c_i$) as a
vector with respect to a basis of it, these two types of objects often tend to be
confused; the differential $dF = Pdx + Qdy + Rdz$ of a function is, therefore,
identified with a vector field


grad $F : (x,y,z) \mapsto (P(x,y,z), Q(x,y,z), R(x,y,z))$


generally written $Pi + Qj + Rk$, where the letters $i$, $j$ and $k$, topped with
arrows, a delight for typesetters, denote the “unit vectors”, i.e. basis vectors,
of the rectangular coordinate system chosen.

Physicists exploit the fact that der Herr, as Einstein used to call him, has
once for all defined the “scalar product” of two vectors of the physical space,
provided, however, that a unit of length as well as an origin $O$ are chosen as
above in the space in order to transform it into a vector space. If the scalar
product of two vectors is written


$(h|k) = h^1k^1 + h^2k^2 + h^3k^3$


in rectangular coordinates (mechanical engineers and physicists often write
it $h.k$), any linear functional $h \mapsto c_1h^1 + c_2h^2 + c_3h^3$ can then be written
$h \mapsto (h|c)$, with a vector $c = (c_1,c_2,c_3)$ by which it is entirely determined;
the linear functional in question can then be identified³⁰ with the vector $c$.


30 In mathematics, identifying two objects means that no difference is made between
them. It is often convenient to avoid useless complications (for example,

But as already mentioned in n° 1, (ii), this identification has no intrinsic or
absolute meaning, whether in a general vector space lacking a distinguished
scalar product, or in a Euclidean space, when non-rectangular coordinate
systems are used.

Moreover, defining the “gradient” of a function $F$ as we have done, i.e.
by differentiating $F(x + th)$ at $t = 0$, is perfectly natural even and especially
from a physics viewpoint. For example, what is a “temperature gradient”
but a measure of the rate of change of temperature when going from a point
$x$ to a point $x + th$ in the direction defined by a given vector $h$? There
are no coordinates in this definition, and a “rate of change” has always
been a derivative. In this case, the physical reason for writing $dF(x;h)$ as
(grad $F(x)|h$) is to highlight the direction, that of the vector grad $F(x)$, in
which the temperature change in the neighbourhood of $x$ is fastest.

Remaining in dimension 3, there are three independent conditions (5.14');
as 3 is the unique positive integer for which $n(n-1)/2 = n$, the Creator must have
invented this miracle on the eve of the Big Bang to mystify his creatures into
believing that they would thus more easily understand his Complete Works
than if he had, for example, chosen a four-dimensional space. This leads
physicists to wrongly associate to the vector field $(P,Q,R)$ the vector field
$(D_2R - D_3Q, D_3P - D_1R, D_1Q - D_2P)$, which they again call the rotational
of the given vector field; its vanishing is a necessary (and sufficient in a
simply connected domain) condition for the vector field $(P,Q,R)$ to come
from a potential (Theorem 3). But for $n = 4$, the Relativity space that the
mystified eventually discovered, there are already 6 conditions (5.14') and it
is no longer a question of rotationals in the sense of a vector field. For $n > 4$
this is even less so.




(ii) Differential forms of degree 2. The proper generalization of physicists’
rotational is the exterior derivative of a differential form, which has already
cropped up in n° 5 and at the end of the previous n°:


$d\omega(x;h,k) = \omega'(x;h,k) - \omega'(x;k,h)$.


In dimension 3, if $\omega = Pdx + Qdy + Rdz$, writing (5.15) explicitly, the expression
$d\omega(x;h,k)$ is indeed equal to

$(D_2R - D_3Q)(h^2k^3 - h^3k^2) + (D_3P - D_1R)(h^3k^1 - h^1k^3)$
$+ (D_1Q - D_2P)(h^1k^2 - h^2k^1)$,


which gives rise to the components of the physicists’ rotational.
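
The displayed expression says that, in rectangular coordinates, $d\omega(x;h,k)$ is the scalar product of the physicists' rotational with the vector whose components are $h^2k^3 - h^3k^2$, $h^3k^1 - h^1k^3$, $h^1k^2 - h^2k^1$. The following sketch checks this numerically; the field $(P,Q,R) = (yz, x^2, xy + z)$, the finite-difference step and the function names are assumptions made for the example, not part of the text.

```python
def omega(x, h):
    """omega(x; h) = P h^1 + Q h^2 + R h^3 for the illustrative field (yz, x^2, xy + z)."""
    P = x[1] * x[2]
    Q = x[0] ** 2
    R = x[0] * x[1] + x[2]
    return P * h[0] + Q * h[1] + R * h[2]


def covariant_derivative(x, h, k, eps=1e-4):
    """Central-difference approximation of omega'(x; h, k) = d/ds omega(x + s h; k) at s = 0."""
    forward = [xi + eps * hi for xi, hi in zip(x, h)]
    backward = [xi - eps * hi for xi, hi in zip(x, h)]
    return (omega(forward, k) - omega(backward, k)) / (2.0 * eps)


def exterior_derivative(x, h, k):
    """d omega(x; h, k) = omega'(x; h, k) - omega'(x; k, h), as in (5.14)."""
    return covariant_derivative(x, h, k) - covariant_derivative(x, k, h)


def curl(x):
    """(D_2 R - D_3 Q, D_3 P - D_1 R, D_1 Q - D_2 P), computed by hand for the field above."""
    return (x[0], 0.0, 2.0 * x[0] - x[2])


def cross(h, k):
    """Components (h^2 k^3 - h^3 k^2, h^3 k^1 - h^1 k^3, h^1 k^2 - h^2 k^1)."""
    return (h[1] * k[2] - h[2] * k[1],
            h[2] * k[0] - h[0] * k[2],
            h[0] * k[1] - h[1] * k[0])
```

For any point `x` and vectors `h`, `k`, the value `exterior_derivative(x, h, k)` should agree, up to rounding, with the sum of the products of the components of `curl(x)` and `cross(h, k)`. The coordinate-free definition is used on the left, the three-dimensional vocabulary on the right.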
As a function of $h$ and $k$ for a given $x$, $d\omega(x;h,k)$ is an alternating bilinear
form in $h$, $k$. Its generalization is a function $\omega(x;h,k)$ of $x \in G$ and of two
varying vectors $h, k \in E$, called a differential form of degree 2 and class $C^p$ on
an open subset $G$ of a Cartesian space $E$, which, for given $x$, is an alternating
bilinear form³¹


$\omega(x) : (h,k) \mapsto \omega(x;h,k)$


in $h, k$ and which, for given $h, k$, is a $C^p$ function of $x$. The case $p = 2$ is
sufficient for what follows.


in Chapter II rational numbers were identified with real ones) when the objects
identified satisfy for example the same computation rules. But this is highly
questionable when for example it supposes a particular choice of a coordinate
system.

Let $B(h,k)$ be an alternating bilinear form (a purely algebraic notion)
and $(a_i)$ a basis for $E$. Since $h = h^ia_i$ and $k = k^ja_j$,


$B(h,k) = h^iB(a_i,k) = h^ik^jB(a_i,a_j) = b_{ij}h^ik^j$


with antisymmetric coefficients


$b_{ij} = B(a_i,a_j) = -b_{ji}$,


that are, therefore, zero for $i = j$. This can also be written


$B(h,k) = b_{ij}h^ik^j = \sum_{i<j} b_{ij}(h^ik^j - h^jk^i) = \frac{1}{2}b_{ij}(h^ik^j - h^jk^i)$.


The factor 1/2 corrects the fact that each term is written twice. In this form,
the antisymmetry is highlighted.
Applying this calculation to $B = \omega(x)$, we, therefore, get the relation


(8.1) $\omega(x;h,k) = p_{ij}(x)h^ik^j = \sum_{i<j} p_{ij}(x)(h^ik^j - h^jk^i) = \frac{1}{2}p_{ij}(x)(h^ik^j - h^jk^i)$


with coefficients


(8.2) $p_{ij}(x) = \omega(x;a_i,a_j) = -p_{ji}(x)$


depending on the basis $(a_i)$ used. Note that in (1), we have been forced to
abandon Einstein’s convention in the sum over $i < j$. When $\omega = d(p_idx^i)$,
$p_{ij} = D_ip_j - D_jp_i$.

In degree 2, expressions similar to $p_i(x)\,dx^i$ can be used. For this, given 
two linear functionals $u(h) = u_ih^i$ and $v(h) = v_ih^i$ on the vector space 
considered, the alternating bilinear form 

(8.3)    $u \wedge v : (h,k) \mapsto u(h)v(k) - u(k)v(h) = u_iv_j\left(h^ik^j - h^jk^i\right) = \left(u_iv_j - u_jv_i\right)h^ik^j$

is called the exterior product of $u$ and $v$. This is a purely algebraic notion. 


31 See for example Cours d'Algèbre by the author, §§ 21 to 24. A differential form 
of degree 2 is, therefore, an “antisymmetric” tensor field of type (0, 2). 

Obviously, 

(8.4)    $u \wedge v = -v \wedge u$

and the product is an alternating bilinear function of $u$ and $v$; in particular, 
$u \wedge u = 0$ always holds. 
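
The definition (8.3) and the rule (8.4) can be checked on a small numerical case. The following is only a minimal sketch: the helper name `wedge` and the sample vectors are illustrative choices, not part of the text, and linear functionals are represented by their coefficient lists $u_i$, $v_i$ with respect to a fixed basis.

```python
def wedge(u, v):
    """Exterior product (8.3) of two linear functionals given by coefficient lists."""
    def apply(w, h):
        # value w(h) = w_i h^i
        return sum(a * b for a, b in zip(w, h))

    def form(h, k):
        # (u ^ v)(h, k) = u(h) v(k) - u(k) v(h)
        return apply(u, h) * apply(v, k) - apply(u, k) * apply(v, h)

    return form


def wedge_by_coefficients(u, v, h, k):
    """Same value, computed as the sum over i < j of (u_i v_j - u_j v_i)(h^i k^j - h^j k^i)."""
    n = len(u)
    return sum(
        (u[i] * v[j] - u[j] * v[i]) * (h[i] * k[j] - h[j] * k[i])
        for i in range(n)
        for j in range(i + 1, n)
    )


# Hypothetical sample data in dimension 3.
u, v = [1, 2, 0], [0, 1, 3]
h, k = [1, 0, 2], [2, 1, 1]

uv = wedge(u, v)
vu = wedge(v, u)
print(uv(h, k), vu(h, k), wedge(u, u)(h, k))
```

The two ways of computing the value agree, the sign changes when $u$ and $v$ are swapped, and $u \wedge u$ vanishes, as (8.4) asserts.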

Applying this definition to the forms $u = dx^i$ and $v = dx^j$, given by 
$u(h) = h^i$ and $v(h) = h^j$, we get the form 

$$dx^i \wedge dx^j : (h,k) \mapsto dx^i(h)\,dx^j(k) - dx^i(k)\,dx^j(h) = h^ik^j - h^jk^i.$$

As a result, (1) can also be written 

$$\omega(x;h,k) = \sum_{i<j} p_{ij}(x)\,dx^i \wedge dx^j(h,k),$$

where $dx^i \wedge dx^j(h,k)$ denotes the value of the alternating bilinear form 
$dx^i \wedge dx^j$ at $(h,k)$; hence the shorthand expression 

(8.5)    $\omega = \sum_{i<j} p_{ij}\,dx^i \wedge dx^j = \frac{1}{2}\,p_{ij}\,dx^i \wedge dx^j$

involving both the bilinear form $dx^i \wedge dx^j$ and the scalar 
function $p_{ij}$. In dimension 3, setting $x, y, z$ for the three coordinates, we 
always write 

(8.5')    $\omega = p\,dy \wedge dz + q\,dz \wedge dx + r\,dx \wedge dy.$
The exterior product of two forms $\omega$ and $\varpi$ of degree 1 is similarly defined: 
it is the form $x \mapsto \omega(x) \wedge \varpi(x)$ of degree 2. If $\omega = p_i\,dx^i$ and $\varpi = q_i\,dx^i$, 
clearly 

$$\omega \wedge \varpi = p_iq_j\,dx^i \wedge dx^j = \sum_{i<j}\left(p_iq_j - p_jq_i\right)dx^i \wedge dx^j.$$

For example, if $f$ and $g$ are two functions on an open subset of $\mathbb{R}^2$, and if $s$ 
and $t$ denote the standard coordinates, then³² 

(8.6)    $df \wedge dg = \left(D_1f\,.\,D_2g - D_2f\,.\,D_1g\right)ds \wedge dt = \dfrac{D(f,g)}{D(s,t)}\,ds \wedge dt,$

an expression involving the Jacobian of the map $(s,t) \mapsto (f(s,t), g(s,t))$. 
With these conventions, the exterior differential of a form $\omega = p_i\,dx^i$ of 
degree 1 can be calculated by writing that 

32 Direct calculation: write that $df \wedge dg = (D_1f\,ds + D_2f\,dt) \wedge (D_1g\,ds + D_2g\,dt)$, 
merely develop and take into account the relations $ds \wedge ds = dt \wedge dt = 0$, 
$dt \wedge ds = -ds \wedge dt$. 

$$d\omega(x;h,k) = dp_i(x;h)\,k^i - dp_i(x;k)\,h^i = dp_i(x;h)\,dx^i(k) - dp_i(x;k)\,dx^i(h) = dp_i(x) \wedge dx^i(h,k),$$

a relation involving the value at $(h,k)$ of the exterior product of the linear 
functionals $h \mapsto dp_i(x;h)$ and $h \mapsto dx^i(h) = h^i$: 

(8.7)    $\omega = p_i\,dx^i \Longrightarrow d\omega = dp_i \wedge dx^i$

for short. 
Exercise 1. Using the definition of $d\omega$, show that 

(8.8)    $d(f\omega) = df \wedge \omega + f\,d\omega$

if $f$ is a function and $\omega$ a form of degree 1. Deduce (7). 

In dimension 3 and in rectangular coordinates, but only in this case, (5') 
can be identified with the vector field $H(x)$ with coordinates $p, q, r$, while the 
vector with coordinates 

$$h^2k^3 - h^3k^2, \qquad h^3k^1 - h^1k^3, \qquad h^1k^2 - h^2k^1$$

is called the vector product (or, which is better, exterior product) of the 
vectors $h$ and $k$, and is written $h \times k$ or $h \wedge k$. Then 

$$\omega(x;h,k) = (H(x)\,|\,h \wedge k),$$

the scalar product of the vectors $H(x)$ and $h \wedge k$. This does not substitute 
for the theory of alternating bilinear forms. 
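
The identification of (8.5') with a scalar product can be illustrated numerically. This is a sketch under the assumption of rectangular coordinates in $\mathbb{R}^3$; the function names and the sample values of $p, q, r$ at a point are illustrative only.

```python
def cross(h, k):
    """Vector (exterior) product h ^ k in rectangular coordinates of R^3."""
    return [
        h[1] * k[2] - h[2] * k[1],
        h[2] * k[0] - h[0] * k[2],
        h[0] * k[1] - h[1] * k[0],
    ]


def dot(a, b):
    """Scalar product (a | b)."""
    return sum(x * y for x, y in zip(a, b))


def two_form(p, q, r):
    """Value at a fixed point of p dy^dz + q dz^dx + r dx^dy, as a function of (h, k)."""
    def omega(h, k):
        dy_dz = h[1] * k[2] - h[2] * k[1]   # (dy ^ dz)(h, k)
        dz_dx = h[2] * k[0] - h[0] * k[2]   # (dz ^ dx)(h, k)
        dx_dy = h[0] * k[1] - h[1] * k[0]   # (dx ^ dy)(h, k)
        return p * dy_dz + q * dz_dx + r * dx_dy
    return omega


# Hypothetical coefficients of the form at some point, and two vectors.
H = [2, -1, 3]
h, k = [1, 0, 2], [2, 1, 1]
omega = two_form(*H)
print(omega(h, k), dot(H, cross(h, k)))
```

Both printed numbers coincide, which is the relation $\omega(x;h,k) = (H(x)\,|\,h \wedge k)$ at the chosen point.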


(iii) Forms of degree p. All this can be generalized³³ and differential forms 
of arbitrary degree $p$ as well as an operation $d$ taking a form of degree $p$ to 
a form of degree $p + 1$ can be defined. A form of degree $p$ is a tensor field of 
type $(0, p)$, i.e. a function 

$$\omega(x; h_1, \ldots, h_p)$$

multilinear in the $h_i$ for given $x$, which is also alternating, i.e. multiplied by $-1$ 
when two variables $h_i$ are permuted. It can easily be shown that³⁴ $\omega = 0$ if $p$ 
is greater than the dimension $n$ of $E$ and that when $p = n$, 

$$\omega(x; h_1, \ldots, h_n) = p(x)\,\det(h_1, \ldots, h_n),$$


33 See for example Henri Cartan, Calcul différentiel. 

34 If $p > n$, the vectors $h_1, \ldots, h_p$ are never linearly independent; hence one of 
them can be expressed as a linear combination of the others. Substituting in 
a $p$-linear alternating form, we get a linear combination of its values for vectors 
that are not pairwise distinct, hence that are zero because of the antisymmetry 
of the form considered. See Cours d'algèbre, § 23, also for determinant theory. 

where $p(x)$ is a numerical function and $\det(h_1, \ldots, h_n)$ is the determinant of 
the vectors $h_i$ (i.e. of the matrix of their coordinates) with respect to a basis 
of $E$. 

Exercise 2. How does the coefficient $p$ change when the basis of $E$ with 
respect to which the determinant of $n$ vectors is defined is changed? 

In the general case, choosing a basis $(a_i)$ for the space $E$, $\omega$ can be written as 

$$\omega(x; h_1, \ldots, h_p) = a_{i_1 \ldots i_p}(x)\,h_1^{i_1} \cdots h_p^{i_p} = \frac{1}{p!}\,a_{i_1 \ldots i_p}(x)\,{\det}^{i_1 \ldots i_p}(h_1, \ldots, h_p)$$

with antisymmetric coefficients 

$$a_{i_1 \ldots i_p}(x) = \omega(x; a_{i_1}, \ldots, a_{i_p}).$$

These are the values of $\omega(x)$ at the canonical basis vectors, and the upper 
indices bearing upon the determinants indicate the $p$ rows that need to be 
extracted from the $n \times p$ matrix of the coordinates of the $h_i$. The factor $1/p!$ 
could be avoided by summing over the systems of strictly increasing indices. 
All this, except the variable $x$, which does not play any role in this context, 
expresses the standard formulas of multilinear algebra and of the theory of 
determinants. It will henceforth hardly be needed, since the way forward 
is sufficiently indicated by forms of degree 1, 2, and 3. 


In the general case, there is also an exterior differentiation operation taking 
forms of degree $p$ to forms of degree $p + 1$. For example, to understand 
what physicists call the “divergence” of a vector field, it is necessary to know 
how to associate a form $d\omega$ of degree 3 to a form 

$$\omega = \frac{1}{2}\,p_{ij}\,dx^i \wedge dx^j = p_{23}\,dx^2 \wedge dx^3 + p_{31}\,dx^3 \wedge dx^1 + p_{12}\,dx^1 \wedge dx^2$$

of degree 2 on $\mathbb{R}^3$ (or on an arbitrary Cartesian space); its value at some 
point $x \in G$ is an alternating trilinear form, i.e. an antisymmetric function 
of three variable vectors $h, k, l \in E$. To define it, we first need to introduce a 
covariant derivative 

(8.9)    $\omega'(x;h,k,l) = \dfrac{d}{dt}\,\omega(x + th; k, l)$ for $t = 0$ 
         $= \dfrac{1}{2}\,dp_{ij}(x;h)\left(k^il^j - k^jl^i\right),$

a linear expression in $h, k, l$ and alternating in $k, l$. Then set 

(8.10)    $d\omega(x;h,k,l) = \omega'(x;h,k,l) - \omega'(x;k,h,l) + \omega'(x;l,h,k)$ 
          $= \omega'(x;h,k,l) + \omega'(x;k,l,h) + \omega'(x;l,h,k)$ 
          $= \dfrac{1}{2}\,D_ip_{jk}(x)\,{\det}^{ijk}(h,k,l),$

where ${\det}^{ijk}(h,k,l)$ is the determinant of order 3 consisting of the coordinates 
with indices $i, j, k$ of the vectors $h, k, l$; we could get rid of the factor 1/2 by 
summing only over the $i, j, k$ such that $j < k$. We do the bare minimum to 
transform $\omega'$ into an alternating form. In particular, considering the form 

$$\omega = p\,dy \wedge dz + q\,dz \wedge dx + r\,dx \wedge dy$$

of degree 2 on $\mathbb{R}^3$ easily leads to 

$$d\omega(x;h,k,l) = \left(D_1p + D_2q + D_3r\right).\det(h,k,l),$$

where $\det(h,k,l)$ is the determinant of the vectors $h, k, l$ with respect to the 
canonical basis. For physicists, the function $D_1p + D_2q + D_3r$ is the divergence 
of the vector field $(p, q, r)$; the determinant is the scalar triple product of the 
vectors $h, k$ and $l$, and is written 

$$(h,k,l) = (h\,|\,k \wedge l) = h^1\left(k^2l^3 - k^3l^2\right) + h^2\left(k^3l^1 - k^1l^3\right) + h^3\left(k^1l^2 - k^2l^1\right),$$

where $k \wedge l$ is the vector or exterior product of $k$ and $l$. 
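
Definitions (8.9) and (8.10) can be tested numerically on a concrete form. The sketch below assumes the illustrative coefficients $p = xy$, $q = yz$, $r = zx$ (so that $D_1p + D_2q + D_3r = x + y + z$) and approximates the covariant derivative by a central difference; none of these choices comes from the text.

```python
def omega(x, k, l):
    """Illustrative 2-form p dy^dz + q dz^dx + r dx^dy with p = xy, q = yz, r = zx."""
    p, q, r = x[0] * x[1], x[1] * x[2], x[2] * x[0]
    return (
        p * (k[1] * l[2] - k[2] * l[1])
        + q * (k[2] * l[0] - k[0] * l[2])
        + r * (k[0] * l[1] - k[1] * l[0])
    )


def covariant_derivative(x, h, k, l, eps=1e-3):
    """omega'(x; h, k, l) = d/dt omega(x + t h; k, l) at t = 0, by central difference."""
    forward = [a + eps * b for a, b in zip(x, h)]
    backward = [a - eps * b for a, b in zip(x, h)]
    return (omega(forward, k, l) - omega(backward, k, l)) / (2 * eps)


def d_omega(x, h, k, l):
    """Exterior derivative as the cyclic sum in (8.10)."""
    return (
        covariant_derivative(x, h, k, l)
        + covariant_derivative(x, k, l, h)
        + covariant_derivative(x, l, h, k)
    )


def det3(h, k, l):
    """Scalar triple product (h | k ^ l)."""
    return (
        h[0] * (k[1] * l[2] - k[2] * l[1])
        + h[1] * (k[2] * l[0] - k[0] * l[2])
        + h[2] * (k[0] * l[1] - k[1] * l[0])
    )


x = [1.0, 2.0, 3.0]
h, k, l = [1.0, 0.0, 2.0], [2.0, 1.0, 1.0], [0.0, 1.0, 1.0]
divergence = x[0] + x[1] + x[2]          # D1 p + D2 q + D3 r for the chosen p, q, r
print(d_omega(x, h, k, l), divergence * det3(h, k, l))
```

Since the chosen coefficients are quadratic, the central difference is exact up to rounding, and the two printed values agree.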

In the general case, start with a function $\omega(x; h_1, \ldots, h_p)$ which, for given 
$x$, is alternating multilinear with respect to the variables $h_i \in E$ and compute 
its covariant derivative 

(8.11)    $\omega'(x; h_0, h_1, \ldots, h_p) = \dfrac{d}{dt}\,\omega(x + th_0; h_1, \ldots, h_p)$ for $t = 0$, 

then set 

(8.12)    $d\omega(x; h_0, \ldots, h_p) = \sum_{0 \le i \le p} (-1)^i\,\omega'(x; h_i, h_0, \ldots, \widehat{h_i}, \ldots, h_p),$

where the accent over the letter $h_i$ indicates that it is omitted. The result is easily 
seen to be multilinear and alternating in $h_0, \ldots, h_p$; any differential form that 
can be written as $d\omega$ is said to be exact. 

Exterior differentiation satisfies some classical properties; proofs can be 
found everywhere, for example in Cartan, Calcul différentiel, but as the best 
way of understanding them is to recover them oneself, I will only state them 
as exercises. 

Exercise 3. For any form $\omega$ of degree $p$, $dd\omega = 0$, in other words: any 
exact differential form is closed. Corollary: the divergence of a rotational is 
always zero. 

Exercise 4. Let $\omega$ be a closed form ($d\omega = 0$) of degree 2 on a star domain 
with respect to 0. Define a form $\varpi$ of degree 1 by setting 

(8.13)    $\varpi(x;h) = \int \omega(tx; x, h)\,t\,dt,$

where integration is over $[0, 1]$. Show that $d\varpi = \omega$. [Imitate calculation (5.6)]. 
For a form of degree $p + 1$, set (Poincaré theorem) 

(8.14)    $\varpi(x; h_1, \ldots, h_p) = \int \omega(tx; x, h_1, \ldots, h_p)\,t^p\,dt.$

These calculations show that locally, any closed differential form of degree $p$ 
is exact, i.e. is the exterior derivative of a form $\varpi$ of degree $p - 1$; the latter 
is not unique: a closed form (i.e., as we are using local arguments, an exact 
differential) can always be added to it. 

This argument still holds globally if $G$ is, for example, star-shaped, and 
in particular if it is convex. But the general case is far more difficult to deal 
with, even in a simply connected domain if forms of degree $\ge 2$ are considered; the 
problem is directly connected to the topology of $G$ (De Rham cohomology); 
the following simple argument may give a vague idea. 

$G$ can always be written as the finite or infinite union of non-empty open convex 
subsets $U_i$; the intersections 

$$U_{ij} = U_i \cap U_j, \qquad U_{ijk} = U_i \cap U_j \cap U_k, \qquad \text{etc.}$$

remain convex (and possibly empty). Then take for example a closed form $\omega$ 
of degree 3 on $G$. There are forms $\omega_i$ of degree 2 on $U_i$ such that 

$$\omega = d\omega_i \quad \text{on } U_i.$$

As $d\omega_i = d\omega_j$ on $U_{ij}$, there are forms $\omega_{ij}$ of degree 1 on the non-empty $U_{ij}$ 
such that 

$$\omega_j - \omega_i = d\omega_{ij} \quad \text{on } U_{ij}.$$

Then $d(\omega_{jk} - \omega_{ik} + \omega_{ij}) = 0$ on each $U_{ijk}$, and so there are forms $\omega_{ijk}$ of 
degree 0 (i.e. functions) on the non-empty $U_{ijk}$ such that 

$$\omega_{jk} - \omega_{ik} + \omega_{ij} = d\omega_{ijk} \quad \text{on } U_{ijk}.$$

So 

$$d\left(\omega_{jkh} - \omega_{ikh} + \omega_{ijh} - \omega_{ijk}\right) = 0 \quad \text{on } U_{ijkh},$$

and as the non-empty $U_{ijkh}$ are convex, the relations 

$$\omega_{jkh} - \omega_{ikh} + \omega_{ijh} - \omega_{ijk} = c_{ijkh}$$

follow, where the $c_{ijkh}$ are constants associated to the non-empty $U_{ijkh}$ and 
satisfying 

$$c_{jkhl} - c_{ikhl} + c_{ijhl} - c_{ijkl} + c_{ijkh} = 0$$

whenever $U_{ijkhl}$ is non-empty. Thus the scheme of non-empty pairwise, three-wise, 
etc. intersections of the open subsets $U_i$ (the “simplicial structure” 
of the cover considered) is involved, and this takes us into the realm of 
cohomology... See André Weil, Sur les théorèmes de de Rham (Comm. Math. 
Helvetici, 1952, or Œuvres). 

Corollaries: In dimension 3, any vector field with zero divergence is locally 
the rotational of a vector field, unique up to a gradient, and every function 
is locally the divergence of a vector field, which is unique up to a rotational. 
For example in the second case, if the given function $f$ is assumed to be defined 
on a star domain with respect to 0, then we find a vector field $(p^1, p^2, p^3)$ 
with divergence $f$ by applying formula (14) for $p = 2$ to the differential form 
$\omega = f(x)\,dx^1 \wedge dx^2 \wedge dx^3$, for which 

$$\omega(x;h,k,l) = f(x)\det(h,k,l);$$

since $\det(h,k,l)$ is the “scalar triple product” $(h,k,l) = (h\,|\,k \wedge l)$ of physicists, 

$$\varpi(x;h,k) = \int f(tx)\,(x\,|\,h \wedge k)\,t^2\,dt = (x\,|\,h \wedge k)\int f(tx)\,t^2\,dt$$

follows. This means that the vector field we sought is 

$$p^i(x) = F(x)\,x^i \quad \text{with} \quad F(x) = \int f(tx)\,t^2\,dt,$$

where integration is over $(0, 1)$. 
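
The recipe $p^i(x) = F(x)x^i$ can be tried on an example. The sketch below assumes the illustrative function $f(x) = 1 + x^1x^2$ on $\mathbb{R}^3$ (star-shaped with respect to 0), evaluates $F$ by Simpson's rule and the divergence by central differences; the helper names and step sizes are arbitrary choices.

```python
def f(x):
    """Illustrative function on R^3 (an assumption, not taken from the text)."""
    return 1.0 + x[0] * x[1]


def simpson(g, n=200):
    """Composite Simpson rule for the integral of g over [0, 1] (n must be even)."""
    step = 1.0 / n
    total = g(0.0) + g(1.0)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * g(i * step)
    return total * step / 3


def F(x):
    """F(x) = integral over [0, 1] of f(t x) t^2 dt."""
    return simpson(lambda t: f([t * c for c in x]) * t * t)


def field(x):
    """Candidate vector field p^i(x) = F(x) x^i."""
    value = F(x)
    return [value * c for c in x]


def divergence(x, eps=1e-4):
    """D1 p^1 + D2 p^2 + D3 p^3, by central differences."""
    total = 0.0
    for i in range(3):
        forward, backward = list(x), list(x)
        forward[i] += eps
        backward[i] -= eps
        total += (field(forward)[i] - field(backward)[i]) / (2 * eps)
    return total


x = [1.0, 2.0, 3.0]
print(divergence(x), f(x))
```

For this $f$ one finds $F(x) = 1/3 + x^1x^2/5$ by hand, and the numerically computed divergence of $F(x)\,x$ reproduces $f(x)$ up to discretization error.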
Exercise 5. Define the exterior product of two alternating multilinear 
forms $f$ and $g$ of degrees $p$ and $q$ by anti-symmetrizing their tensor product: 

(8.15)    $f \wedge g\,(h_1, \ldots, h_{p+q}) = \dfrac{1}{p!\,q!}\sum_s \varepsilon(s)\,f\left(h_{s(1)}, \ldots, h_{s(p)}\right) g\left(h_{s(p+1)}, \ldots, h_{s(p+q)}\right),$

where summation is over all permutations $s$ of $\{1, \ldots, p+q\}$ and where $\varepsilon(s)$ 
denotes the signature of $s$; the factor $1/p!\,q!$ could be omitted by summing 
only over the permutations such that $s(1) < \ldots < s(p)$ and $s(p+1) < \ldots < 
s(p+q)$. Show that 

(8.16)    $g \wedge f = (-1)^{pq}\,f \wedge g$

and (this is the hardest part) that the exterior product is associative. 
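
Formula (8.15) translates almost literally into code, which makes it possible to experiment with (8.16) on examples before proving it. The following is a minimal sketch: forms are represented as plain Python functions of $p$ vectors, the degrees are passed explicitly, and all names are illustrative.

```python
from fractions import Fraction
from itertools import permutations
from math import factorial


def signature(perm):
    """Signature of a permutation of 0..n-1, computed from its number of inversions."""
    n = len(perm)
    inversions = sum(1 for i in range(n) for j in range(i + 1, n) if perm[i] > perm[j])
    return -1 if inversions % 2 else 1


def wedge(f, p, g, q):
    """Exterior product (8.15) of alternating forms f (degree p) and g (degree q)."""
    def form(*vectors):
        assert len(vectors) == p + q
        total = 0
        for s in permutations(range(p + q)):
            args = [vectors[i] for i in s]
            total += signature(s) * f(*args[:p]) * g(*args[p:])
        return Fraction(total, factorial(p) * factorial(q))
    return form


def dx(i):
    """Coordinate functional h -> h^i (indices start at 0 here)."""
    return lambda h: h[i]


h, k, l = [1, 0, 2], [2, 1, 1], [0, 1, 1]
dy_dz = wedge(dx(1), 1, dx(2), 1)            # a form of degree 2
volume = wedge(dx(0), 1, dy_dz, 2)           # dx ^ (dy ^ dz), degree 3
volume_swapped = wedge(dy_dz, 2, dx(0), 1)   # (dy ^ dz) ^ dx
print(volume(h, k, l), volume_swapped(h, k, l))
```

With $p = 1$ and $q = 2$ the sign $(-1)^{pq}$ is $+1$, so both products give the determinant of $(h, k, l)$; with $p = q = 1$ the product changes sign when the factors are swapped.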
Exercise 6. The exterior product of two differential forms is defined by 
the previous exercise. Show that 

(8.17)    $d(\omega \wedge \varpi) = d\omega \wedge \varpi + (-1)^p\,\omega \wedge d\varpi$

if $\omega$ is of degree $p$. 


The notion of inverse image formulated in n° 6 for forms of degree 1 can 
be trivially made to apply to the general case: given a map $f : U \to V$ and 
a form $\omega$ of degree $p$ on $V$, the inverse image of $\omega$ under $f$ is the form 

(8.18)    $\omega \circ f : (x, h_1, \ldots, h_p) \mapsto \omega\left[f(x); f'(x)h_1, \ldots, f'(x)h_p\right],$

as in the case of the differentiation of composite functions and of forms of 
degree 1. This is the only definition conceivable given the data available; it can 
be generalized to any tensor field $T$ of type $(0, p)$, since antisymmetry obviously 
plays no role in the definition. The reader will easily prove the following 
formulas: 

(8.19)    $\omega \circ (g \circ f) = (\omega \circ g) \circ f,$
(8.20)    $(\omega \wedge \varpi) \circ f = (\omega \circ f) \wedge (\varpi \circ f),$
(8.21)    $d(\omega \circ f) = d\omega \circ f.$

They are just trivial consequences of the multivariate chain rule: apply the 
definitions. They can even be generalized to tensor fields, provided exterior 
differentiation is replaced by covariant differentiation (3.10). 


9 — Extended Integrals over a 2-Dimensional Path 


Physicists say that if we consider a surface bounded by a regular curve in an 
electromagnetic field, the flow of the field through the surface is equal to the 
circulation of the electric current vector around its boundary. This fundamental 
law, experimentally discovered by Ampère and Faraday in geometrically 
trivial cases (for example, a circular plane surface), was formulated mathematically 
by Maxwell about 1870 by taking account of the fact that the 
“magnetic field” vector is the rotational of the “electric” vector. It is based 
on a precise mathematical result, Stokes' formula, which gives a relationship 
between curvilinear integrals and “surface” integrals over $\mathbb{R}^3$. 

With their practice of mathematical conjuring tricks and their possession 
of what a recent author³⁵ calls (with admiration? irony?) a powerful 
weapon: a striking intuition, literally based on centuries of collective experience, 
physicists give almost instantaneous proofs of this;³⁶ these astound 
mathematicians who, having stopped arguing as they did a hundred and fifty years 

35 Michel Talagrand, Verres de spin et optimisation combinatoire (talk in the N. 
Bourbaki Seminar, n° 859, March 1999, p. 8). 

36 In Paris, physicists have even been seen to write Taylor's formula, Maxwell's 
equations, Stokes' formula, and chemists to calculate eigenfunctions of 
Schrödinger's operator for the hydrogen atom, in front of first year students 
who had not yet understood or even learnt what a partial derivative was, and to 
reproach mathematicians who argue correctly and in the natural pedagogical order 
for making students “lose their time”. Apparently, a number of physicists do 
not understand that mathematicians give as much importance to rigour as they 
give to experiments. Besides, it is interesting to observe that the hostility with 
which many physicists regard abstract or modern mathematics does not extend 
to modern physics, some sectors of which, like quantum mechanics invented in 
the same period, are all the same quite abstract and “modern” too. This evolution 
can also be understood by comparing chemistry and biology classes of the 
1930s with those taught today, especially in high schools. 

ago, laboriously reach the result. In fact, Stokes' formula is one of the hardest 
to prove rigorously in an elementary manner; and even using the highest level 
of mathematics, no one knows how to exactly define the right category of 
“surfaces” (of arbitrary dimension) to apply it to; the problem is to include 
integration domains whose “edges” are sufficiently regular for integration to 
be possible on them, while including sufficiently general singularities so as not 
to exclude important practical or theoretical cases. For example, the border 
of a polyhedron is not a “smooth” surface; it has sharp corners and edges. 
Excluding such a simple example from the application domain of Stokes' formula 
would run counter to common sense, but proving it in a sufficiently 
general framework so as to include polyhedra (in other words, “simplicial 
complexes” from algebraic topology) poses substantial difficulties since at the 
same time the case of perfectly smooth surfaces has to be covered. Physicists 
reply that the surface of a tetrahedron is in reality made of four perfectly 
smooth triangles and that the edges do not count in the integration, or that 
“corners may be cut” without significantly changing the result. True, but 
they obviously do not have to prove it. It is for good reason that this problem 
is at the origin of the Bourbaki group; in the 1930s, when its program 
was to write a usable treatise for university teaching, a mathematician of 
the level of André Weil asked Henri Cartan if he knew a good method for 
proving the formula, it being agreed that all mathematicians (I used to do 
it in 1947) have always been able to present the type of “proof” that satisfies 
physicists. 


(i) The exterior derivative as an infinitesimal integral. Let us again consider 
a differential form $p_i(x)h^i = \omega(x;h)$ of degree 1 and class $C^1$ on an 
open subset $G$ of a Cartesian space $E$. Take a point $x \in G$, fix two 
vectors $h$ and $k$ and, for given $s, t \in \mathbb{R}$, consider the plane parallelogram 
$P(x, sh, tk) = P$, i.e. the set of points of the form $x + uh + vk$, where $u$ 
(resp. $v$) varies between 0 and $s$ (resp. $t$); assume $s$ and $t$ to be sufficiently 
small so that $P$ is contained in $G$. When we follow the sides of this 
parallelogram in the direction indicated in fig. 2, the border of $P$ is transformed 
into a closed integration path written $\partial P$, the boundary of $P$. Let us 
calculate the integral $I(x; sh, tk)$ of $\omega$ along this path. On the side connecting 
$x$ to $x + sh$, the parametric representation $u \mapsto x + uh$ can be used, which 
gives a contribution equal to 

$$\int_0^s \omega(x + uh; h)\,du.$$

Contributions from the other sides are calculated in a similar fashion. 
$I(x; sh, tk)$ is thus seen to be equal to 

$$\int_0^s \omega(x + uh; h)\,du + \int_0^t \omega(x + sh + vk; k)\,dv + \int_s^0 \omega(x + uh + tk; h)\,du + \int_t^0 \omega(x + vk; k)\,dv,$$

and so 

(9.1)    $I(x; sh, tk) = \int_0^t \left[\omega(x + sh + vk; k) - \omega(x + vk; k)\right]dv - \int_0^s \left[\omega(x + uh + tk; h) - \omega(x + uh; h)\right]du.$

As the functions of $(s, t, u, v)$ being integrated are $C^1$, it is for instance possible 
to differentiate under the $\int$ sign with respect to $s$. By the FT, the 
derivative of the second integral is $\omega(x + sh + tk; h) - \omega(x + sh; h)$. To differentiate 
the first one, the second term in the subtraction can be omitted 
since it does not depend on $s$; to differentiate the first term, use the definition 

(9.2)    $\omega'(x + sh; h, k) = \dfrac{d}{ds}\,\omega(x + sh; k)$

of the covariant derivative and so finally, 

(9.3)    $\dfrac{d}{ds}\,I(x; sh, tk) = \int_0^t \omega'(x + sh + vk; h, k)\,dv - \left[\omega(x + sh + tk; h) - \omega(x + sh; h)\right].$


Now differentiate with respect to $t$; by the FT, the derivative of the integral 
is $\omega'(x + sh + tk; h, k)$; by (2), that of the expression between [ ] is equal to 
$\omega'(x + sh + tk; k, h)$ since $\omega(x + sh; h)$ does not depend on $t$. It follows that 

(9.4)    $\dfrac{d^2}{dt\,ds}\,I(x; sh, tk) = \omega'(x + sh + tk; h, k) - \omega'(x + sh + tk; k, h) = d\omega(x + sh + tk; h, k).$

In particular, 

$$d\omega(x; h, k) = \frac{d^2}{dt\,ds}\,I(x; sh, tk) \quad \text{for } s = t = 0.$$

Having done this, we can “retrace” our calculations; as the right hand side 
of (3) is clearly zero for $t = 0$, the FT and (4) show that 

$$\frac{d}{ds}\,I(x; sh, tk) = \int_0^t d\omega(x + sh + vk; h, k)\,dv;$$

and as $I(x; sh, tk) = 0$ for $s = 0$, it also follows that 

(9.5)    $I(x; sh, tk) = \int_0^s du \int_0^t d\omega(x + uh + vk; h, k)\,dv.$

Hence for $s = t = 1$, 

(9.6)    $\int_{\partial P(x;h,k)} \omega = \iint_{I^2} d\omega(x + uh + vk; h, k)\,du\,dv,$

which is an extended double integral over the square $I^2 = I \times I$ in the plane; 
this supposes that $h$ and $k$ are sufficiently small so that the surface of the 
parallelogram $P(x; h, k)$ is contained in the open subset on which $\omega$ is 
defined. 
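
Formula (9.6) lends itself to a direct numerical check in the plane. The sketch below assumes the illustrative form $\omega = p\,dx + q\,dy$ with $p = -xy$, $q = x^2$ (so that $D_1q - D_2p = 3x$), a sample parallelogram, and Simpson's rule for all integrals; these choices and the helper names are assumptions made for the example.

```python
def simpson(g, n=20):
    """Composite Simpson rule for the integral of g over [0, 1] (n must be even)."""
    step = 1.0 / n
    total = g(0.0) + g(1.0)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * g(i * step)
    return total * step / 3


def omega(point, h):
    """omega(x; h) for the illustrative form p dx + q dy with p = -xy, q = x^2."""
    x, y = point
    return (-x * y) * h[0] + (x * x) * h[1]


def d_omega(point, h, k):
    """d omega(x; h, k) = (D1 q - D2 p)(h^1 k^2 - h^2 k^1), here with D1 q - D2 p = 3x."""
    return 3 * point[0] * (h[0] * k[1] - h[1] * k[0])


def shifted(a, b, c=1.0):
    """The point a + c b."""
    return (a[0] + c * b[0], a[1] + c * b[1])


def boundary_integral(x, h, k):
    """Integral of omega along the four sides of P(x; h, k), in the order x, x+h, x+h+k, x+k."""
    side1 = simpson(lambda u: omega(shifted(x, h, u), h))
    side2 = simpson(lambda v: omega(shifted(shifted(x, h), k, v), k))
    side3 = simpson(lambda u: omega(shifted(shifted(x, k), h, u), h))
    side4 = simpson(lambda v: omega(shifted(x, k, v), k))
    return side1 + side2 - side3 - side4


def surface_integral(x, h, k):
    """Double integral over the unit square of d omega(x + u h + v k; h, k)."""
    return simpson(
        lambda u: simpson(lambda v: d_omega(shifted(shifted(x, h, u), k, v), h, k))
    )


x, h, k = (1.0, 0.0), (2.0, 1.0), (-1.0, 3.0)
print(boundary_integral(x, h, k), surface_integral(x, h, k))
```

All integrands are polynomials of degree at most 2 in the integration variable, so Simpson's rule is exact up to rounding and both sides come out equal.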

The simplest case can be obtained by assuming that $\omega = p\,dx + q\,dy$ is a 
form on $\mathbb{R}^2$, and so 

(9.7)    $d\omega = (D_1q - D_2p)\,dx \wedge dy = p_{12}\,dx \wedge dy.$

(6) can then be written 

(9.8)    $\int_{\partial P} p\,dx + q\,dy = \iint_{I^2} p_{12}(x + uh + vk)\left(h^1k^2 - h^2k^1\right)du\,dv,$

where $P$ is a parallelogram with initial point $x$ generated by the vectors $h$ 
and $k$. In particular, if $x = 0$ and if $h$ and $k$ are the unit vectors of the 
coordinate axes, then we get the Green-Riemann formula, unless it be the 
Gauss formula, 

(9.9)    $\int_{\partial I^2} p\,dx + q\,dy = \iint_{I^2} (D_1q - D_2p)\,dx\,dy$

for the square $I^2$, provided the boundary $\partial I^2$ of $I^2$ is given the usual positive 
orientation. Cauchy proved it directly: 

$$\iint_{I^2} D_1q\,dx\,dy = \int dy \int D_1q(x,y)\,dx = \int_0^1 \left[q(1,y) - q(0,y)\right]dy = \int_{\partial I^2} q\,dy$$

since the integral of $q\,dy$ over the horizontal sides of $I^2$ is obviously zero. He used 
this calculation to show that the integral of a holomorphic function along the 
boundary of a square is zero, the relation $D_1q - D_2p = 0$ being just its 
holomorphy condition in this case. Gauss, Green, Cauchy, Stokes, Riemann: 
many fathers for essentially the same result. There is also an Ostrogradsky 
in dimension three. 


(ii) Stokes' formula for a 2-dimensional path. Replacing the map $(s,t) \mapsto 
x + sh + tk$ by a $C^2$ map $\sigma : I \times I \to G$ on $I \times I$ in the sense of § 1, n° 2, 
(iv) leads to a generalization: $\sigma$ is of class $C^2$ on the open square and its 
derivatives of order $\le 2$ can be extended by continuity to the closed square. 
As already seen, $\sigma$ defines two families of paths in $G$: 

(9.10)    $\mu_s : t \mapsto \sigma(s,t)$

and 

(9.11)    $\nu_t : s \mapsto \sigma(s,t).$

Given a differential form $\omega$ of class $C^1$ and degree 1 on the open subset 
$G$, set 

(9.12)    $F(s) = F(\mu_s) = \int_{\mu_s} \omega = \int \omega\left[\sigma(s,t); D_2\sigma(s,t)\right]dt.$

Let us compute the derivative of $F(s)$ by direct calculations that will produce 
a less primitive version of Stokes' formula than (6); it will be seen further 
down that the same final result can be obtained more quickly by using 
Green-Riemann, but it is necessary to accustom the reader to using the 
multivariate chain rule... 

The $\int$ sign will always denote an extended integral over $I = [0, 1]$. 

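We start, in telegraphic style, from the formula 

(9.13)    $F(s) = \int \omega(\sigma; D_2\sigma)\,dt.$

Therefore, 

(9.14)    $D_1\left[\omega(\sigma; D_2\sigma)\right] = \dfrac{d}{ds}\left\{\omega\left[x(s); h(s)\right]\right\}$

needs to be computed, where $x(s) = \sigma(s,t)$, $h(s) = D_2\sigma(s,t)$ for fixed $t$. The 
multivariate chain rule shows that 

$$D_1\left\{\omega\left[x(s); h(s)\right]\right\} = \omega'\left[x(s); D_1x(s), h(s)\right] + \omega\left[x(s); D_1h(s)\right],$$

and so 

(9.15)    $D_1\left[\omega(\sigma; D_2\sigma)\right] = \omega'(\sigma; D_1\sigma, D_2\sigma) + \omega(\sigma; D_1D_2\sigma).$

But the proof of (15) also shows that 

$$D_2\left[\omega(\sigma; D_1\sigma)\right] = \omega'(\sigma; D_2\sigma, D_1\sigma) + \omega(\sigma; D_2D_1\sigma) = \omega'(\sigma; D_2\sigma, D_1\sigma) + \omega(\sigma; D_1D_2\sigma),$$

and so, by (15) and the definition of $d\omega$, 

(9.16)    $D_1\left[\omega(\sigma; D_2\sigma)\right] = D_2\left[\omega(\sigma; D_1\sigma)\right] + d\omega(\sigma; D_1\sigma, D_2\sigma).$

Hence, by (13), it follows that 

(9.17)    $F'(s) = \int D_2\left[\omega(\sigma; D_1\sigma)\right]dt + \int d\omega(\sigma; D_1\sigma, D_2\sigma)\,dt.$

As $D_2 = d/dt$, the first integral is the change of $\omega(\sigma; D_1\sigma)$ between $t = 0$ 
and $t = 1$. Integrating $F'(s)$ over $(0, 1)$, we then get (FT) 

(9.18)    $F(\mu_1) - F(\mu_0) = \int \omega\left[\sigma(s,1); D_1\sigma(s,1)\right]ds - \int \omega\left[\sigma(s,0); D_1\sigma(s,0)\right]ds + \iint d\omega(\sigma; D_1\sigma, D_2\sigma)\,ds\,dt.$

Using the paths $\nu_t$ and $\mu_s$ defined above, and writing $F(\nu)$ for the integral of $\omega$ 
along a path $\nu$, (18) can be written as 

(9.19)    $F(\mu_1) - F(\mu_0) = F(\nu_1) - F(\nu_0) + \iint d\omega(\sigma; D_1\sigma, D_2\sigma)\,ds\,dt.$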

If the four integration paths occurring in (19) are concatenated into a single 
path $\partial\sigma : [0, 4] \to G$ given by the formulas 

$$\partial\sigma(t) = \begin{cases} \sigma(0, t) = \mu_0(t) & (0 \le t \le 1) \\ \sigma(t - 1, 1) = \nu_1(t - 1) & (1 \le t \le 2) \\ \sigma(1, 3 - t) = \mu_1(3 - t) & (2 \le t \le 3) \\ \sigma(4 - t, 0) = \nu_0(4 - t) & (3 \le t \le 4), \end{cases}$$

relation (19) becomes 

(9.20)    $\int_{\partial\sigma} \omega = \iint_\sigma d\omega$

provided, generally speaking, that we set 

(9.21)    $\iint_\sigma \varpi = \iint_{I^2} \varpi\left[\sigma(s,t); D_1\sigma(s,t), D_2\sigma(s,t)\right]ds\,dt$

for any sufficiently differentiable 2-dimensional path $\sigma : I^2 \to G$ and any 
differential form $\varpi$ of degree 2 on $G$; the similarity with the definition of a 
curvilinear integral is clear, and the reader will surely think of generalizing 
it to forms of arbitrary degree. 

Fig. 9.3. 
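
Relation (9.19) can be tested numerically for a concrete 2-dimensional path. The sketch below assumes the illustrative map $\sigma(s,t) = (s + t^2, st)$ into $\mathbb{R}^2$ and the form $\omega = -y\,dx + x\,dy$ (so that $d\omega = 2\,dx \wedge dy$); the derivatives of $\sigma$ are entered by hand and all names are hypothetical.

```python
def simpson(g, n=40):
    """Composite Simpson rule for the integral of g over [0, 1] (n must be even)."""
    step = 1.0 / n
    total = g(0.0) + g(1.0)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * g(i * step)
    return total * step / 3


def sigma(s, t):
    """Illustrative 2-dimensional path in R^2."""
    return (s + t * t, s * t)


def d1_sigma(s, t):
    """Partial derivative of sigma with respect to s."""
    return (1.0, t)


def d2_sigma(s, t):
    """Partial derivative of sigma with respect to t."""
    return (2 * t, s)


def omega(point, h):
    """omega(x; h) for the form -y dx + x dy."""
    x, y = point
    return -y * h[0] + x * h[1]


def d_omega(point, h, k):
    """d omega(x; h, k) = 2 (h^1 k^2 - h^2 k^1) for this form."""
    return 2.0 * (h[0] * k[1] - h[1] * k[0])


def F_mu(s):
    """Integral of omega along mu_s : t -> sigma(s, t)."""
    return simpson(lambda t: omega(sigma(s, t), d2_sigma(s, t)))


def F_nu(t):
    """Integral of omega along nu_t : s -> sigma(s, t)."""
    return simpson(lambda s: omega(sigma(s, t), d1_sigma(s, t)))


def surface_term():
    """Double integral of d omega(sigma; D1 sigma, D2 sigma) over the unit square."""
    return simpson(
        lambda s: simpson(
            lambda t: d_omega(sigma(s, t), d1_sigma(s, t), d2_sigma(s, t))
        )
    )


left = F_mu(1.0) - F_mu(0.0)
right = F_nu(1.0) - F_nu(0.0) + surface_term()
print(left, right)
```

By hand one finds $F(\mu_s) = s^2 - s/3$, $F(\nu_t) = t^3$ and a double integral equal to $-1/3$, so both sides of (9.19) equal $2/3$; the code reproduces these values up to rounding.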

When the initial form $\omega$ is closed, the double integral vanishes from (20), 
which, therefore, indicates that the integral of $\omega$ along the closed path $\partial\sigma$ 
is zero. We thereby recover the invariance of the integral under homotopy 
(modulo obvious conditions: fixed endpoints, or contour remaining closed), 
but by imposing differentiability conditions on $\sigma$ that are too strong. 

Exercise. Generalize the proof to linear homotopies between paths of class 
$C^{1/2}$ [imitate the calculations of Chapter VIII, n° 3, (iii)]. 

Formula (20) is one of the possible versions of Stokes' formula in dimension 
two in a case which, compared to the physicists' traditional version where integration 
is over an excellent perfectly smooth surface, can seemingly present 
pathological aspects: the image of the square $I^2$ under $\sigma$ can have all sorts of 
singularities (sharp corners, edges, pleats,³⁷ etc.) if the rank (n° 2, (v)) of 
$\sigma$ is not assumed to be everywhere equal to 2, i.e. maximum. Moreover, as $\sigma$ 
is not assumed to be injective, even if $\sigma$ has maximum rank everywhere, the 
image of $I^2$ may resemble a paper sheet or a tube with multiple crossings, 
analogous to a curve in dimension 2 with multiple points. 


(iii) Integral of an inverse image. The expression integrated on the right 
hand side of (21) is clearly the inverse image $\varpi \circ \sigma$ of $\varpi$ under $\sigma$, defined at the end 
of n° 8. Hence, by definition, 

(9.21')    $\iint_\sigma \varpi = \iint_{I^2} \varpi \circ \sigma$

37 Consider the map $(s,t) \mapsto (s^2, t^2)$ of $J \times J$ to the plane, where $J = [-1, +1]$, or 
the map $(s,t) \mapsto (\sin^2 s, t)$ on the strip $0 \le t \le 1$. 

for any form $\varpi$ of degree 2 and any 2-dimensional path $\sigma : I^2 \to X$ of 
class $C^1$. 

Denoting by $\sigma^i(s,t)$ the coordinates of $\sigma(s,t)$ with respect to a basis for 
$E$, if 

$$\varpi = \frac{1}{2}\,p_{ij}\,dx^i \wedge dx^j$$

with respect to this basis, then the expression integrated can be calculated 
by replacing the $dx^i$ by the $d\sigma^i = D_1\sigma^i(s,t)\,ds + D_2\sigma^i(s,t)\,dt$ and so $dx^i \wedge dx^j$ 
by 

$$J^{ij}(s,t)\,ds \wedge dt \quad \text{where} \quad J^{ij} = D_1\sigma^i\,.\,D_2\sigma^j - D_1\sigma^j\,.\,D_2\sigma^i$$

is the Jacobian of the functions $\sigma^i$ and $\sigma^j$ with respect to $s, t$. So the right 
hand side of (21') reduces to the classical double integral 

(9.22)    $\iint \frac{1}{2}\,p_{ij}\left[\sigma(s,t)\right]J^{ij}(s,t)\,.\,ds\,dt = \iint r(s,t)\,ds\,dt.$

The computation is immediate as long as the mechanism of exterior products 
and inverse images has been understood. 

Version (20) of the result obtained suggests a far quicker proof, which 
amounts to applying Gauss' formula (9) to the square $I^2 = K$. Indeed, the 
left hand side is by definition a curvilinear integral: the integral over the 
parameter interval of the inverse image $\omega \circ \partial\sigma$. But denoting by $\partial K$ the path with 
initial point 0 consisting in following the border of $K$ counterclockwise, clearly 

$$\partial\sigma = \sigma \circ \partial K,$$

and so $\omega \circ \partial\sigma = \omega \circ (\sigma \circ \partial K) = (\omega \circ \sigma) \circ \partial K$ by (8.19). The left hand side 
of (20) is, therefore, the integral of $\omega \circ \sigma$ along the path $\partial K$. On the other 
hand, by (8.21), 

$$d\omega \circ \sigma = d(\omega \circ \sigma).$$

Setting $\omega \circ \sigma = \varpi$ to be a form of degree 1 on $K$, by (21'), relation (20) means 

$$\int_{\partial K} \varpi = \iint_K d\varpi,$$

which reduces to (9) as expected. 

The preceding calculation can be generalized. Consider two Cartesian 
spaces $E$ and $F$, two open sets $U \subset E$ and $V \subset F$, a map $f : U \to V$ and 
a form $\varpi$ of degree 2 on $V$. Let $\sigma : I^2 \to U$ be a 2-dimensional path in $U$. 
This gives a path $f \circ \sigma : I^2 \to V$ in $V$. Having said this, 

(9.23)    $\iint_\sigma \varpi \circ f = \iint_{f \circ \sigma} \varpi.$

Indeed the two sides are, by definition, the integrals over $I^2$ of the forms 
$(\varpi \circ f) \circ \sigma$ and $\varpi \circ (f \circ \sigma)$. It, therefore, suffices to use (8.19) in order to 
obtain (23). 


(iv) A planar example. Consider the simplest case: $E = \mathbb{R}^2$. Take 
two real $C^1$ functions $f_0$ and $f_1$ on $I$. Set 

$$\mu_0(t) = (t, f_0(t)), \qquad \mu_1(t) = (t, f_1(t))$$

so that these two paths amount to following the graphs of $f_0$ and $f_1$. Under 
the corresponding linear homotopy 

(9.24)    $\sigma(s,t) = (1 - s)\mu_0(t) + s\mu_1(t) = \left(t, (1 - s)f_0(t) + sf_1(t)\right)$

Fig. 9.4. 

the image of $I \times I$ is the subset of $\mathbb{R}^2$ bounded by these graphs and by the 
verticals with coordinates 0 and 1 with respect to the $x$-axis, the boundary 
$\partial\sigma$ of $\sigma$ together with the direction followed being indicated in the above 
figure. If 

$$\omega = p\,dx + q\,dy$$

is a form of degree 1, its integral can be computed along $\partial\sigma$ by choosing 
parameters $y$ along the vertical paths and $x$ along the graphs of $f_0$ and $f_1$; 
thus 

$$\int_{\partial\sigma} \omega = \int_0^1 \left\{p\left[x, f_0(x)\right] + q\left[x, f_0(x)\right]f_0'(x)\right\}dx + \int_{f_0(1)}^{f_1(1)} q(1, y)\,dy + \int_1^0 \left\{p\left[x, f_1(x)\right] + q\left[x, f_1(x)\right]f_1'(x)\right\}dx + \int_{f_1(0)}^{f_0(0)} q(0, y)\,dy.$$

These four simple oriented integrals correspond to the obvious four paths 
making up $\partial\sigma$. As for the double integral of 

$$d\omega = (D_1q - D_2p)\,dx \wedge dy = r(x,y)\,dx \wedge dy$$

occurring in (20), it is calculated by replacing $(x,y)$ by $\sigma(s,t)$ and $dx \wedge dy$ 
by $J_\sigma(s,t)\,ds\,dt$, where 


$$J_\sigma(s,t) = f_0(t) - f_1(t)$$

is the Jacobian of map (24). Here, integral (21') can, therefore, be written 

$$\iint r\left[t, (1 - s)f_0(t) + sf_1(t)\right].\left[f_0(t) - f_1(t)\right]ds\,dt.$$


Replacing $t$ by $x$ and making a change of variable $y = 
(1 - s)f_0(x) + sf_1(x)$ in the integration with respect to $s$, we get the integral 


i wf” r(x, y)dy, 
fo(2) 


where the integral with respect to $y$ is oriented. If the Lebesgue-Fubini formula 
(33.7) of Chapter V, § 9 is naively applied, the result seems to be just 
the extended ordinary double integral of $r$ over the compact subset $A$ of $\mathbb{R}^2$ 
bounded by the graphs of $f_0$ and $f_1$ and the verticals $x = 0$ and $x = 1$. This is 
the case only if $f_0(t) \le f_1(t)$ for all $t$. In fact, denoting by $A_+$ (resp. $A_-$) the 
subset of $A$ on which $f_1(x) - f_0(x)$ is positive (resp. negative), 
(20) can be written as 

(9.25)    $\int_{\partial\sigma} p\,dx + q\,dy = \iint_{A_+} (D_1q - D_2p)\,dx\,dy - \iint_{A_-} (D_1q - D_2p)\,dx\,dy.$

This corresponds to the fact that the path $\partial\sigma$, which is the image under $\sigma$ of 
the border of the square $I^2$, is made up of several closed simple curves, some 
oriented counterclockwise, others clockwise. If $f_1(t) \ge f_0(t)$ everywhere, we 
recover the Green-Riemann formula in a slightly more general case, but one as 
simple to prove directly with the help of a small Cauchy calculation and of 
the most elementary version of Lebesgue-Fubini. 

(v) Classical version. In classical analysis, there was a “surface” $S$ (a 
sphere, a torus, a somewhat deformed rectangle, etc.) and a vector field 
$(P, Q, R)$ in the usual three dimensional Euclidean space; the aim was to 
define the extended integral $\iint P\,dy\,dz + Q\,dz\,dx + R\,dx\,dy$ over $S$. The general 
method consisted in using a “bi-univocal parametric representation”, i.e. a 
bijective one, of $S$, given by functions 

(9.26)    $x = \varphi(s,t), \qquad y = \psi(s,t), \qquad z = \theta(s,t)$

of two real “parameters” $s, t$ varying in a subset $A$ of $\mathbb{R}^2$ bounded by one or 
many simple curves such as those encountered in Cauchy's residue formula; 
this amounts to setting 

$$\sigma(s,t) = \left(\varphi(s,t), \psi(s,t), \theta(s,t)\right).$$

Having said this, $dy\,dz$, $dz\,dx$, $dx\,dy$ had to be replaced by 

(9.27)    $\dfrac{D(\psi, \theta)}{D(s,t)}\,ds\,dt, \qquad \dfrac{D(\theta, \varphi)}{D(s,t)}\,ds\,dt, \qquad \dfrac{D(\varphi, \psi)}{D(s,t)}\,ds\,dt$

and $x, y, z$ by their expressions in terms of $s$ and $t$ in the functions $P, Q, R$; an 
expression of the form $r(s,t)\,ds\,dt$ was thus obtained and integrated over the 
domain of variation $A$ of $(s,t)$. To make sure that $S$ was indeed a “smooth” 
surface, without sharp corners, edges or any other singularities, it was assumed 
that the above three Jacobians were never simultaneously zero, in 
other words, that the tangent linear map to $\sigma$ at $(s,t)$ was everywhere injective 
(the necessity of this assumption regarding submanifolds of a Cartesian 
space will be explained in n° 13, (iv)) or else that $\sigma$ was of rank 2; the image 
of $\mathbb{R}^2$ under the tangent map $\sigma'(s,t)$ was then the “tangent plane” to the 
surface $S$ at the point $\sigma(s,t)$. 

Clearly, the expression $P\,dy\,dz + Q\,dz\,dx + R\,dx\,dy$ to be integrated is just 
the differential form 

$$\omega = P\,dy \wedge dz + Q\,dz \wedge dx + R\,dx \wedge dy$$

on $\mathbb{R}^3$ and, if $K$ is the unit square $I^2$, then the surface integral to be computed 
is just the integral (21') of $\omega$ extended to the path $\sigma$. 

The integral thus defined also had to be shown to depend solely on the 
given vector field and surface $S$, and not on the chosen parametric representation; 
as will be seen in the next n°, this was ensured by a change of variable 
formula in ordinary double integrals, the only difficulty residing in its proof. 

As for (20), it is the traditional Stokes formula; it was applied by
assuming that the vector field (P,Q,R) to be integrated was the curl (“rotational”) of a
vector field H. The integral of H along the curve C bounding S gave the
left hand side of (20). The expression to be integrated was often described
as the scalar product of H and of an infinitesimal vector with components
dx, dy, dz, written dM, where the letter M denotes a variable point of the
curve. Physicists wrote the double integral in a similar way by regarding
expressions (27) as the components of the metaphysical vector dS normal to S
(i.e. orthogonal to the tangent plane to S) at the point considered and with
length “the element of infinitesimal surface” of S, whatever that meant, so
that (20) finally became

$\int_C H \cdot dM = \iint_S \operatorname{rot} H \cdot dS$.


After much thought, it was possible to understand that dS was the “vector
product” of the vectors with initial point M(s,t) and terminal points the
points M(s+ds,t) and M(s,t+dt) of the surface, in other words, in our
notation, the vectors $D_1\sigma(s,t)\,ds$ and $D_2\sigma(s,t)\,dt$ generating the tangent plane
to S at the point M considered. As the classical “vector product” $h \wedge k$ of
two vectors is orthogonal to them and its length is the area of the
parallelogram given by h and k, saying that the components of the vector dS of
physicists are the expressions (27) amounts to choosing, at each point $M \in S$,
an orientation of the normal to S at M or, equivalently, a “positive direction
of rotation” in the tangent plane to the surface S, namely that which takes
the vector $D_1\sigma(s,t)$ to the vector $D_2\sigma(s,t)$; this choice determines the
orientation of the normal: its unit vector together with the previous two vectors
must form a “direct” trihedron.

Since the left hand side supposes the border curve C to be oriented and
changes sign if the orientation is reversed, the surface integral inevitably
involved a question of orientation. This difficulty was overcome by orienting
the normals to S coherently and by orienting C accordingly. The “coherence”
of the orientations of the normals to S meant (more accurately, presupposed)
that the unit vector of the oriented normal at a point $M \in S$ had to be a
continuous function of M. As will be shown in n° 13, (iv), any smooth surface
admits a parametric representation (26) in the neighbourhood of each of its
points, and even, up to a permutation of its canonical coordinates, an
equation z = f(x,y); so its normals can be oriented in the neighbourhood of each
of its points. When a global orientation can be found, the surface S is said
to be orientable. The well-known “Möbius strip” is not (it is the image
of a square under a map $\sigma$ of rank 2 everywhere, but which is not injective
since, to obtain a closed strip, the images of two opposite sides need to
be identical) and in this case there is no Stokes formula in the traditional
sense.

The orientation of C was then chosen by a simple rule. In good cases
(physicists do not consider any other), the curve C is indeed, as seen above,
the image under $\sigma$ of the boundary of the set $A \subset \mathbb{R}^2$ in which (s,t) varies.
If S is oriented in this manner, i.e. by using the map $\sigma$ to transfer to S the
positive rotational direction in $\mathbb{R}^2$, then C must be oriented by transferring
the traditional “positive” orientation of the boundary $\partial A$ by using $\sigma$. It
was then explained that if you follow the curve in the chosen direction while
remaining constantly upright on the tangent plane to S, so that the vector
going from your feet to your head is oriented
like the normal to the surface, then, looking straight in front of you, you
should see the surface on your left. Maxwell’s immortal corkscrew rule was
also available.

All this mess made mathematicians with “modernist” tendencies laugh
or repelled them, as the case may be, especially those inspired by Élie Cartan.
Nevertheless, it should be mentioned that he devoted far more effort to giving
the theory a concise and aesthetic formal aspect than to justifying its formulas:
a quite impossible task before the modern development of topology and of
the theory of differentiable manifolds, and one which continues to raise serious
problems today.³⁸


10 — Change of Variables in a Multiple Integral 


Thus in the classical version, a “surface integral” needed to be shown to be
solely dependent on the geometric surface and not on its parametric
representation (9.26). This is true (up to sign, as in dimension one, where a question
of orientation arises) for injective maps $\sigma$ of maximum rank everywhere,
but even in this case the answer is obviously not easily available: the
corresponding statement in dimension one already requires the change of variable
formula in a simple integral, so the same will necessarily hold in dimension two.
Indeed, suppose that $\sigma$ is replaced by $\sigma \circ \varphi$, where $\varphi$ is a diffeomorphism from
$I^2$ to $I^2$, which leaves the image of the square invariant. The integral over $I^2$
of

$\omega \circ \sigma = r(s,t)\,ds \wedge dt$

is then replaced by that of

$\omega \circ (\sigma \circ \varphi) = (\omega \circ \sigma) \circ \varphi = r[\varphi(s,t)]\,J_\varphi(s,t)\,ds \wedge dt$.


To solve the problem, it is thus necessary to already know that

(10.1) $\iint_{I^2} r(s,t)\,ds\,dt = \iint_{I^2} r[\varphi(s,t)]\,J_\varphi(s,t)\,ds\,dt$

holds up to sign (orientation!); it is a particular case of the change of variable
formula in multiple integrals which will be proved in this n°. Indeed, if $\varphi$ is
a diffeomorphism, its Jacobian does not vanish in K, hence its sign remains
constant there. If the function r is positive, then so is its integral, whereas
the integral of $r[\varphi]\,J_\varphi$ has the sign of $J_\varphi$.
Hence the sign to be used is necessarily that of $J_\varphi(s,t)$, which will be written
$\mathrm{sgn}(\varphi)$. In other words, the correct formula is

(10.2) $\iint_K r[\varphi(s,t)]\,J_\varphi(s,t)\,ds\,dt = \mathrm{sgn}(\varphi) \iint_K r(x,y)\,dx\,dy$

or, equivalently,

(10.3) $\iint_K r[\varphi(s,t)]\,|J_\varphi(s,t)|\,ds\,dt = \iint_K r(x,y)\,dx\,dy$.


38 Henri Cartan, Calcul différentiel, still needs nineteen pages to prove Stokes’
formula and that of change of variable in the simplest case of a compact subset
of $\mathbb{R}^2$ bounded by a reasonable curve.

This formula can be made to apply to much more general situations:


Theorem 4. Let $U \subset \mathbb{R}^n$ be an open bounded set, A its closure and $\varphi$ a
diffeomorphism of class $C^1$ from A to a compact subset $B \subset \mathbb{R}^n$. Suppose
that the borders of A and B have measure zero. Then, for any integrable³⁹
function f on B,

(10.4) $\int_B f(x)\,dx = \int_A f[\varphi(t)]\,|J_\varphi(t)|\,dt$,

denoting by $x = (x^1,\dots,x^n)$ or $t = (t^1,\dots,t^n)$ the integration variable, by
$dm(x) = dx = dx^1 \dots dx^n$ or dt the Lebesgue measure on $\mathbb{R}^n$ and writing
$\int$ for a multiple integral. This is similar to what happens in a non-oriented
simple integral in which a strictly monotone change of variable $x = \varphi(t)$ is
being carried out: the factor $\varphi'(t)$ involved in the oriented integrals has to be
replaced by its absolute value:

$\int f(x)\,dx = \int f[\varphi(t)]\,|\varphi'(t)|\,dt$,

since otherwise the sides could have opposite signs.
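
The role of the absolute value in the non-oriented simple integral can be seen on a one-line example. The following Python sketch is an illustration added here, with a hypothetical helper `midpoint` and the arbitrary choices f(x) = x² and the decreasing change of variable x = 1 − t on [0, 1]: with the absolute value both sides agree, without it the sides have opposite signs.

```python
def midpoint(fn, a, b, n=1000):
    """Midpoint-rule approximation of the non-oriented integral of fn over [a, b]."""
    step = (b - a) / n
    return sum(fn(a + (i + 0.5) * step) for i in range(n)) * step

f = lambda x: x * x
phi = lambda t: 1.0 - t     # strictly decreasing bijection of [0, 1] onto itself
dphi = lambda t: -1.0       # its derivative, negative everywhere

lhs = midpoint(f, 0.0, 1.0)
rhs_abs = midpoint(lambda t: f(phi(t)) * abs(dphi(t)), 0.0, 1.0)
rhs_signed = midpoint(lambda t: f(phi(t)) * dphi(t), 0.0, 1.0)
```

Here `lhs` and `rhs_abs` both approximate 1/3, while `rhs_signed` approximates −1/3.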

We will prove the formula for continuous functions. By n° 10 and 11 of
Chapter V, §2 it can then be immediately generalized to lsc (resp. usc)
functions; for this, simply observe that if $\Phi$ is an increasing (resp. decreasing)
philtre of continuous functions, then the functions $f[\varphi(s,t)]\,|J_\varphi(s,t)|$, where
$f \in \Phi$, form an increasing (resp. decreasing) philtre of continuous functions
on A. Hence it is possible to pass to the limit under the $\int$ sign on both sides
of this formula. Lebesgue’s theorems lead to the general case in a few lines.
In particular, it applies if f is the characteristic function of an open or closed
subset of $\varphi(A)$, hence of the form $\varphi(M)$ where $M \subset A$ is an open or closed
subset in A. We find the measure of $\varphi(M)$ on the left hand side; on the right,
we integrate the characteristic function of M, whence

(10.4’) $m[\varphi(M)] = \int_M |J_\varphi(t)|\,dt$.


(i) Case where $\varphi$ is linear. This is the simplest case and we will give
a proof using classical properties⁴⁰ of the group $GL_n(\mathbb{R}) = G$ of n × n real


39 In the sense of Lebesgue.

40 As shown by K. Iwasawa about 1950 in a justly famous article, in
particular lemma d below if it is properly interpreted, these properties generalize to
all semisimple Lie groups, a class of groups singled out by Élie Cartan whose
extraordinary properties continue to be the subject of numerous studies
mixing algebraic geometry, number theory, generalizations of the theory of modular
functions, non-commutative harmonic analysis (there is a quite sophisticated
version of Fourier transforms for these groups), PDEs, etc. Apart from “classical”
groups such as the matrix group equipped with a symmetric or alternating bi- 
invertible matrices and, contrary to some shorter ones, requiring few integral
calculations.

Set $E = \mathbb{R}^n$ and let L(E) be the set of continuous functions with compact
support in E. We will not differentiate between a matrix $g \in G$ and the linear
map $x \mapsto g(x)$ corresponding to it on E, for which g is the matrix with
respect to the canonical basis $(e_i)$: the formulas

(10.5) $g(e_i) = g_i^j e_j, \quad g(x)^j = g_i^j x^i$,

where the $g(x)^j$ are the canonical coordinates of the vector g(x), in conformity
with tensor conventions, then follow. The Jacobian of y = g(x) with respect
to x is then equal to det(g), so that it suffices to prove that for all $f \in L(E)$,

(10.6) $\int f[g(x)]\,dx = |\det(g)|^{-1} \int f(x)\,dx$,

where integration is over all of E.
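
Before the proof, the linear change of variable formula can be checked numerically in dimension 2. The Python sketch below is only an illustration: the test function `bump` (continuous, supported in the unit disc, with integral π/3), the matrix `g` of determinant −3 and the midpoint rule on the square [−1.5, 1.5]² are all choices made for the example.

```python
import math

def bump(x, y):
    """Continuous function with compact support: (1 - |x|^2)^2 inside the unit disc."""
    r2 = x * x + y * y
    return (1.0 - r2) ** 2 if r2 < 1.0 else 0.0

def integrate_plane(fn, half_width, n=300):
    """Midpoint rule over the square [-half_width, half_width]^2."""
    step = 2.0 * half_width / n
    total = 0.0
    for i in range(n):
        x = -half_width + (i + 0.5) * step
        for j in range(n):
            y = -half_width + (j + 0.5) * step
            total += fn(x, y)
    return total * step * step

g = ((2.0, 1.0), (0.0, -1.5))                       # hypothetical invertible matrix
det_g = g[0][0] * g[1][1] - g[0][1] * g[1][0]       # equals -3

def f_after_g(x, y):
    # the function x -> f[g(x)]
    return bump(g[0][0] * x + g[0][1] * y, g[1][0] * x + g[1][1] * y)

lhs = integrate_plane(f_after_g, 1.5)               # integral of f[g(x)]
rhs = integrate_plane(bump, 1.5) / abs(det_g)       # |det g|^{-1} times integral of f
```

Both numbers should be close to π/9; the negative sign of the determinant plays no role, only its absolute value does.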


Lemma a. Let $\mu$ be a Radon measure on $\mathbb{R}^n$. Assume $\mu$ to be invariant
under translations. Then $\mu$ is proportional to the Lebesgue measure m.


Note first that, for any $f, g \in L(E)$, the function $(x,y) \mapsto f(x)g(y - x)$
has compact support in $E \times E$, since if f and g are zero outside the compact
sets M and N, then it can be $\neq 0$ only if (x,y) belongs to the compact
set $M \times (M + N)$. The most elementary Lebesgue-Fubini theorem can then
be applied to this function, and in the following, formal calculations are
possible. Having said this,

$m(f)\mu(g) = \iint f(x)g(y)\,dm(x)\,d\mu(y) = \iint f(x)g(y - x)\,dm(x)\,d\mu(y)$

by the change of variable $y \mapsto y - x$ in the integration with respect to y; the
change of variable $x \mapsto x + y$ in the integration with respect to m then gives

$m(f)\mu(g) = \iint f(x + y)g(-x)\,dm(x)\,d\mu(y) =$

$= \int g(-x)\,dm(x) \int f(x + y)\,d\mu(y) =$
$= \mu(f) \int g(-x)\,dm(x)$,


linear form, this category contains “exceptional” groups whose construction is 
far less obvious. 

For a more elementary but far less instructive proof, see for example Rudin, 
Real and Complex Analysis, end of Chapter 8, where a proof of the complete re- 
sult analogous to ours can also be found, as well as more subtle results from the 
theory of integration. Dieudonné, Eléments d’analyse (vol. 3, XVI.22) uses the 
fact that, locally, all diffeomorphisms decompose into simpler maps (changing 
one variable at a time) for which the formula is more or less obvious, thereby 
avoiding all approximation calculations presented in parts (ii) and (iii) below. 
Choosing g so that $\int g(-x)\,dm(x) = 1$,

$\mu(f) = \mu(g)\,m(f)$

for all f, qed.

Lemma b. For every $g \in G$, there is a number A(g) > 0 such that

(10.7) $\int f[g(x)]\,dm(x) = A(g) \int f(x)\,dm(x)$

for all $f \in L(E)$.

The formula $\mu(f) = \int f[g(x)]\,dm(x)$ defines a Radon measure on E and $\mu$ is
clearly invariant under translations, since g(x + a) = g(x) + g(a); hence the
formula, by lemma a. A(g) > 0 because if f is positive, then
so are the two integrals.

As can now be seen, if g is the diagonal matrix $(t_1,\dots,t_n)$, in which case
the left hand side of (7) is the integral of $f(t_1x^1,\dots,t_nx^n)$, then the change
of variables $x^i \mapsto t_ix^i$ shows that $A(g) = |t_1 \dots t_n|^{-1} = |\det(g)|^{-1}$.


Lemma c. The map A is a continuous homomorphism from G to the
multiplicative group $\mathbb{R}^*_+$.


The relation A(gh) = A(g)A(h) can be obtained by applying lemma b twice.
To show that A is continuous, observe first that, like g(x), f[g(x)] is a
continuous function of the couple $(g,x) \in G \times E$; it is, moreover, zero outside
$g^{-1}(M)$, where $M \subset E$ is the compact support of f; but when g varies
in a compact subset N of G, the set $g^{-1}(M)$ remains in the image K of the
compact set $N \times M$ under the map $(g,x) \mapsto g^{-1}(x)$. As $g \mapsto g^{-1}$, and so
$(g,x) \mapsto g^{-1}(x)$, is also continuous, K is compact. Hence if g remains in a
fixed compact subset N of G, the integral of lemma b is in
fact taken over a fixed compact subset of E. Continuity with respect to
the parameter $g \in N$ is then clear (Chapter V, §2, n° 9, Theorem 9).
Observing that, G being an open subset of $M_n(\mathbb{R})$, every $g \in G$ has a compact
neighbourhood N leads to the conclusion.

The rest of the proof consists in determining all the continuous
homomorphisms from G to $\mathbb{R}^*_+$: they are all of the form $g \mapsto |\det(g)|^s$, with $s \in \mathbb{R}$.


Lemma d. For every matrix $g \in G$, there are orthogonal matrices u and v
and a diagonal matrix $t = (t_1,\dots,t_n)$, with $t_i > 0$, such that g = utv.


Denote the usual scalar product on $\mathbb{R}^n$ by (x|y) and write g’ for the
transpose of a matrix g. It is characterized by the identity

$(g(x)|y) = (x|g'(y))$.

The orthogonal subgroup $K = O_n(\mathbb{R})$ of G is the set of matrices such that
g’g = 1, i.e. such that ||g(x)|| = ||x|| for all x; it is obviously closed and
bounded in $M_n(\mathbb{R})$, and so compact. On the other hand, there are symmetric
matrices in G, and even in $M_n(\mathbb{R})$, i.e. such that h’ = h; for such matrices,
we know there is an orthonormal basis $(a_i)$ in $\mathbb{R}^n$ and scalars $t_i \in \mathbb{R}$ such
that $h(a_i) = t_ia_i$ for all i (diagonalization), and conversely. Writing t for the
diagonal matrix $(t_1,\dots,t_n)$ and u for the orthogonal matrix transforming the
canonical basis $(e_i)$ into $(a_i)$,

$hu(e_i) = h(a_i) = t_iu(e_i) = u(t_ie_i) = ut(e_i)$

for all i (no summation over i, obviously!), and so hu = ut, i.e.

$h = utu^{-1}$.

This result holds for all $g \in G$ and h = g’g. As

$(g'g(x)|x) = (g(x)|g(x)) > 0$

for all $x \neq 0$, $t_i > 0$ in this case. Then write $h^{1/2}$ for the operator given by
$h^{1/2}(a_i) = (t_i)^{1/2}a_i$; it is symmetric and

$(g(x)|g(y)) = (g'g(x)|y) = (h^{1/2}h^{1/2}(x)|y) = (h^{1/2}(x)|h^{1/2}(y))$

for all x and y, and so $(gh^{-1/2}(x)|gh^{-1/2}(y)) = (x|y)$. The orthogonality of
$gh^{-1/2} = w$ follows. But the argument showing that $h = utu^{-1}$ shows as well
that $h^{1/2} = ut^{1/2}u^{-1}$, and so, finally,

$g = wh^{1/2} = wut^{1/2}u^{-1} = vt^{1/2}u^{-1}$,

where v = wu is orthogonal, qed.
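
The steps of this proof can be followed on a 2 × 2 matrix. The Python sketch below is an illustration only: the helper names `matmul`, `transpose`, `decompose` and the sample matrix are assumptions of the example, and the diagonalization of h = g’g is done by the standard rotation angle for a symmetric 2 × 2 matrix rather than by any method taken from the text.

```python
import math

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def transpose(a):
    return [[a[j][i] for j in range(2)] for i in range(2)]

def decompose(g):
    """Return (v, t_half, u) with g = v * diag(t_half) * u^{-1}, u and v orthogonal."""
    h = matmul(transpose(g), g)                    # h = g'g, symmetric, positive
    a, b, d = h[0][0], h[0][1], h[1][1]
    theta = 0.5 * math.atan2(2.0 * b, a - d)       # rotation angle diagonalizing h
    c, s = math.cos(theta), math.sin(theta)
    u = [[c, -s], [s, c]]                          # columns: orthonormal eigenvectors a_i
    t = matmul(transpose(u), matmul(h, u))         # diagonal up to rounding, t_i > 0
    t_half = [math.sqrt(t[0][0]), math.sqrt(t[1][1])]
    inv_root = matmul(u, matmul([[1.0 / t_half[0], 0.0], [0.0, 1.0 / t_half[1]]],
                                transpose(u)))     # h^{-1/2} = u t^{-1/2} u^{-1}
    w = matmul(g, inv_root)                        # w = g h^{-1/2}, orthogonal
    v = matmul(w, u)
    return v, t_half, u

g = [[2.0, 1.0], [0.0, -1.5]]                      # hypothetical sample matrix
v, t_half, u = decompose(g)
rebuilt = matmul(v, matmul([[t_half[0], 0.0], [0.0, t_half[1]]], transpose(u)))
```

The product of the two diagonal entries `t_half` equals |det(g)| (here 3), which is the fact exploited below to compute A(g).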


Lemma e. Let K be a compact topological group and A a continuous
homomorphism from K to the multiplicative group $\mathbb{C}^*$. Then |A(k)| = 1 for all
$k \in K$.


The image of K under A is indeed a compact subgroup H of $\mathbb{C}^*$. For any
$t \in H$, the set of the $t^n$ ($n \in \mathbb{Z}$) must, therefore, be bounded, and so |t| = 1.
Corollary: |det(u)| = 1 for all $u \in O_n(\mathbb{R})$.


Lemma f. Any continuous homomorphism A from $\mathbb{R}^*_+$ to $\mathbb{R}^*_+$ is of the form
$A(t) = t^s$ for some $s \in \mathbb{R}$.


This is the characterization of power functions: Chapter IV, n° 6,
Theorem 4.

We can now return to the calculation of the factor A(g) of lemma b.
Writing g = utv, by lemma e, A(g) = A(u)A(t)A(v) = A(t). If t is the
diagonal matrix (1,...,1,t,1,...,1), where t > 0 is in the i-th place, A(t) is a
power function of t by lemma f. As any positive diagonal matrix is a product
of like matrices, we get a formula of type

$A(t) = t_1^{s_1} \dots t_n^{s_n}$

with the $s_i \in \mathbb{R}$ a priori arbitrary.
But let us consider the linear operator $w_\sigma$ which, for a given
permutation $\sigma$ of {1,...,n}, transforms $e_i$ into $e_{\sigma(i)}$. Then $w_\sigma \in K$ and
$w_\sigma t w_\sigma^{-1} = (t_{\sigma(1)},\dots,t_{\sigma(n)})$. As $A(w_\sigma t w_\sigma^{-1}) = A(t)$,
$s_1 = \dots = s_n = s$ by lemma e, and so

(10.8) $A(t) = \det(t)^s$.


For arbitrary $g \in G$, formula g = utv then shows that

(10.9) $A(g) = |\det(g)|^s$


with an absolute value since A(g) must be > 0. This is the general form of
continuous homomorphisms from $GL_n(\mathbb{R})$ to $\mathbb{R}^*_+$.

To finish the proof of the change of variable formula, it remains to observe
that, if g(x) = tx where $t \in \mathbb{R}$ is non-zero, then the factor A(g) of lemma b
is clearly equal to $|t|^{-n}$, and so s = −1.

Exercise. Show that any $g \in GL_n(\mathbb{R})$ can be written as g = khu where k
is orthogonal, h positive diagonal and u triangular with diagonal (1,...,1);
verify (6) for g = u by changing variables in the simple integrals and deduce
(6) for g. (The decomposition g = khu means that any basis $(a_i)$ of $\mathbb{R}^n$ can be
orthonormalized by a triangular linear map applied to the $a_i$: Gram-Schmidt
orthogonalization process.)

Exercise. Set $G = GL_n(\mathbb{C})$. Show that any continuous homomorphism
from G to $\mathbb{C}^*$ [resp. $\mathbb{R}^*$] is of the form $g \mapsto |\det(g)|^s \det(g)^p$ with $s \in \mathbb{C}$ and
$p \in \mathbb{Z}$ [resp. $s \in \mathbb{R}$, $p \in \{1,-1\}$].


(ii) Approximation Lemmas. We now return to the general case of the
theorem. To prove it, we need some preliminary results justifying what physicists
take to be obvious: that in the neighbourhood of a point a, a diffeomorphism
is approximately linear, and “so” multiplies volumes by the absolute
value of the determinant of its differential map. It then suffices to add the
results and one easily obtains the general formula...

In what follows, the length or the norm |u| of a vector $u \in \mathbb{R}^n$ is defined
by

(10.10) $|u| = \sup(|u^1|,\dots,|u^n|)$;

for this “cubic” norm, an open ball centered at a and of radius r is the set
$|u^i - a^i| < r$. To avoid confusion, it will be called the cube centered at a and of
radius r, and will be written U(a,r) or K(a,r) according to whether
the cube is open or closed. Because of this norm, a reasonable set can be
approximately decomposed into parallelepipeds whose pairwise intersections
are faces that are negligible in the integration. This operation cannot be
performed using genuine Euclidean balls. As in any normed vector space, the
norm of a linear map A is defined by

$\|A\| = \sup_{x \neq 0} |Ax|/|x|$.
The crucial point is the following lemma, in which K(r) = K(0,r).


Lemma 1. Let U be an open subset of $\mathbb{R}^n$ containing 0 and $\psi : U \to \mathbb{R}^n$ a
$C^1$ map such that $\psi(0) = 0$, $\psi'(0) = 1$. Given a number q such that 0 < q < 1,
let r be a number > 0 such that $K(r) \subset U$ and

(10.11) $|x| \le r \Longrightarrow \|\psi'(x) - 1\| \le q$.

Then

(10.12) $K[(1-q)r] \subset \psi[K(r)] \subset K[(1+q)r]$.


The derivative $\psi'(x)$ being a continuous function of x equal to 1 at x = 0,
the existence of r for given q > 0 is obvious. Having said that, suppose
$x \in K(r)$, so that $tx \in K(r)$ for $0 \le t \le 1$. The derivative of $t \mapsto \psi(tx)$ is
$\psi'(tx)x$; as $\psi(0) = 0$, $\psi(x) = \int \psi'(tx)x\,dt$, and so

$\psi(x) - x = \int [\psi'(tx) - 1]\,x\,dt$,

where integration is over (0, 1). Since, by (11),

$|\psi'(tx)x - x| \le \|\psi'(tx) - 1\| \cdot |x| \le \|\psi'(tx) - 1\|\,r \le qr$,

$|\psi(x) - x| \le qr$ and so

$|\psi(x)| \le |x| + qr \le (1+q)r$,

which proves the right half of (12).

To prove the other, less easy, inclusion, imitating the proof of the local
inversion theorem is a possibility. It all amounts to showing that, for all
$\zeta \in K[(1-q)r]$, there exists $z \in K(r)$ such that $\psi(z) = \zeta$. For this, set
$\psi(z) = z + \rho(z)$. So $\rho'(z) = \psi'(z) - 1$; by (11),

(10.13) $|z| \le r \Longrightarrow |\rho(z)| \le q|z|$.

Then construct a sequence of points

$z_1 = \zeta, \quad z_2 = \zeta - \rho(z_1), \quad z_3 = \zeta - \rho(z_2), \quad \dots$

as in Chap. III, §5, Theorem 24, whose proof we follow (except that, through lack of
foresight, q = 1/2 was chosen in Chapter III). We must make sure that the
construction continues without obstruction, i.e. that $z_1 \in K(r)$, which is obvious,
and that $z_n \in K(r)$ implies $z_{n+1} \in K(r)$. Now, by (13),

$|z_{n+1}| \le |\zeta| + |\rho(z_n)| \le (1-q)r + q|z_n| \le r$.

Having done this, the mean value inequality behind (13), applied to $\rho$ on the
convex set K(r) where $\|\rho'\| \le q$, shows that

$|z_{n+1} - z_n| = |\rho(z_n) - \rho(z_{n-1})| \le q|z_n - z_{n-1}|$

with q < 1. Hence $\lim z_n = z \in K(r)$ exists, with obviously $\psi(z) = \zeta$, qed.
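
The successive approximations used in this proof are easy to run. The Python sketch below is an illustration under stated assumptions: the map ψ(z) = z + ρ(z) with ρ(x,y) = (0.2x², 0.2xy) is a hypothetical example for which ψ(0) = 0, ψ'(0) = 1 and, for the cubic norm, ‖ψ'(z) − 1‖ ≤ 0.4 on K(1); so r = 1, q = 0.4, and the target ζ is taken in K(0.6).

```python
def rho(z):
    x, y = z
    return (0.2 * x * x, 0.2 * x * y)

def psi(z):
    r = rho(z)
    return (z[0] + r[0], z[1] + r[1])

def cubic_norm(z):
    return max(abs(z[0]), abs(z[1]))

def solve(zeta, steps=60):
    """Iterate z_{n+1} = zeta - rho(z_n), starting from z_1 = zeta."""
    z = zeta
    for _ in range(steps):
        r = rho(z)
        z = (zeta[0] - r[0], zeta[1] - r[1])
    return z

zeta = (0.6, -0.5)          # a point of K[(1 - q) r] = K(0.6)
z = solve(zeta)
residual = cubic_norm((psi(z)[0] - zeta[0], psi(z)[1] - zeta[1]))
```

The limit `z` stays in K(1) and satisfies ψ(z) = ζ up to rounding, as the inclusion K[(1−q)r] ⊂ ψ[K(r)] predicts.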
Let us now suppose that the assumptions of theorem 4 hold. 


Lemma 2. For all q > 0, there exists r > 0 such that, for $x, y \in A$,

(10.14) $|x - y| \le r \Longrightarrow \|\varphi'(x) - \varphi'(y)\| \le q / \|\varphi'(x)^{-1}\|$.


The map $x \mapsto \varphi'(x)$ being continuous on A and $\varphi'(x)$ being invertible for
all $x \in A$, the map $x \mapsto \varphi'(x)^{-1}$ is also continuous on A: indeed Cramer’s
formulas⁴¹ tell us how to calculate the entries of the matrix of $\varphi'(x)^{-1}$ from
those of the matrix of $\varphi'(x)$. The norm $\|\varphi'(x)^{-1}\|$ is, therefore, also a
continuous function on A and, A being compact, it is bounded on A. Setting
$\sup \|\varphi'(x)^{-1}\| = 1/M$, we get $M \le 1/\|\varphi'(x)^{-1}\|$ for all $x \in A$, and (14) will hold if

(10.15) $|x - y| \le r \Longrightarrow \|\varphi'(x) - \varphi'(y)\| \le Mq$.


But $x \mapsto \varphi'(x)$ is uniformly continuous since A is compact. So for all q > 0,
there exists r satisfying (15), qed.

In the following statement, m(X) denotes the Lebesgue measure of a
measurable set $X \subset \mathbb{R}^n$, as it happens a compact set.


Lemma 3. For all q such that 0 < q < 1, there exists r > 0 satisfying the
following property: for all $a \in U$ such that $K(a,r) = K \subset U$,

(10.16) $\big|\, m[\varphi(K)] - |J_\varphi(a)|\,m(K) \,\big| \le q\,m(K)$.


Take a point $a \in U$ and replace $\varphi$ by

$\varphi_a : x \mapsto \varphi'(a)^{-1}[\varphi(a + x) - \varphi(a)]$,


41 There is an easier argument in the case of an arbitrary Banach space E; it is based
on the fact that, for any linear operator T with norm < 1, the operator 1 − T has
an inverse, namely $\sum T^n$ (the series converges absolutely since $\|T^n\| \le q^n$, where
q = ||T||). Let A and X be two continuous linear operators on E; set X = A − Y
and suppose that A is invertible. Then $X = A(1 - A^{-1}Y)$, so that X is invertible
if $\|A^{-1}Y\| = q < 1$; as $\|A^{-1}Y\| \le \|A^{-1}\| \cdot \|Y\|$, this is the case if $\|Y\| < 1/\|A^{-1}\|$,
i.e. if X is sufficiently near A. Then $X^{-1} = (1 - A^{-1}Y)^{-1}A^{-1} = \sum (A^{-1}Y)^nA^{-1}$,
and so

$\|X^{-1}\| \le \|A^{-1}\|/(1 - q) \le 2\|A^{-1}\|$

if $\|Y\| \le 1/2\|A^{-1}\|$. As $X^{-1} - A^{-1} = A^{-1}(A - X)X^{-1} = A^{-1}YX^{-1}$,

$\|X^{-1} - A^{-1}\| \le \|A^{-1}\| \cdot \|X - A\| \cdot \|X^{-1}\| \le 2\|A^{-1}\|^2 \cdot \|X - A\|$

follows. Hence $X \mapsto X^{-1}$ is continuous at A.
where $\varphi'(a)$ is the tangent linear map to $\varphi$ at a. Then $\varphi_a(0) = 0$ and

$\varphi_a'(x) = \varphi'(a)^{-1}\varphi'(a + x)$,

and so $\varphi_a'(0) = \mathrm{id}$. Choose a number $q' \in\, ]0,1[$ such that

(10.17) $1 - q \le (1 - q')^n, \quad (1 + q')^n \le 1 + q$

(the end of the proof will explain this bizarre condition) and apply lemma
1 to $\varphi_a$ by replacing q by q’ in it. To be able to apply it, it suffices that $\varphi_a$
be defined for $|x| \le r$, i.e. that $K(a,r) \subset U$, and that

(10.18) $|x| \le r \Longrightarrow \|\varphi_a'(x) - 1\| \le q'$.

But

$\|\varphi_a'(x) - 1\| = \|\varphi'(a)^{-1}\varphi'(a + x) - 1\| =$
$= \|\varphi'(a)^{-1}[\varphi'(a + x) - \varphi'(a)]\| \le$
$\le \|\varphi'(a)^{-1}\| \cdot \|\varphi'(a + x) - \varphi'(a)\|$.


Condition (18) will, therefore, hold if, for $x, y \in U$,

(10.19) $|x - y| \le r \Longrightarrow \|\varphi'(x) - \varphi'(y)\| \le q'/\|\varphi'(x)^{-1}\|$.


Lemma 2 shows that, for all q’ > 0, there exists r satisfying this condition.
Hence r can indeed be chosen so that (18) holds for all $a \in U$ such that
$K(a,r) \subset U$.

Having done this, consider these points $a \in U$. Lemma 1 applied to $\varphi_a$
shows that

$K(r - q'r) \subset \varphi_a[K(r)] \subset K(r + q'r)$,

where K(r) = K(0,r). Applying $\varphi'(a)$ to the terms of this relation, $\varphi_a$ is
replaced by the map $x \mapsto \varphi(a + x) - \varphi(a)$; the image of K(r) under this map
is⁴² $\varphi[a + K(r)] - \varphi(a) = \varphi[K(a,r)] - \varphi(a)$; hence, setting K = K(a,r) as
above,

$\varphi'(a)[K(r - q'r)] \subset \varphi(K) - \varphi(a) \subset \varphi'(a)[K(r + q'r)]$,

and so, applying the translation by the vector $\varphi(a)$,

(10.20) $\varphi'(a)[K(r - q'r)] + \varphi(a) \subset \varphi(K) \subset \varphi'(a)[K(r + q'r)] + \varphi(a)$.

But since $\varphi'(a)$ is linear, formula (6) shows that, for all r > 0,

$m\{\varphi'(a)[K(r)]\} = |J_\varphi(a)|\,m[K(r)]$.


42 For a set $E \subset \mathbb{R}^n$ and some $b \in \mathbb{R}^n$, the notation E + b denotes the image of E
under the translation $u \mapsto u + b$. In particular, K(r) + a = K(a,r).
The measure of K(r) being proportional to $r^n$, (20) shows that

(10.21) $(1 - q')^n \le m[\varphi(K)] / |J_\varphi(a)|\,m(K) \le (1 + q')^n$.

However, condition (17) has been imposed on q’; it necessarily follows that

$1 - q \le m[\varphi(K)] / |J_\varphi(a)|\,m(K) \le 1 + q$,

and hence (16) holds (up to the factor $|J_\varphi(a)|$, which is bounded on A and can
be absorbed by shrinking q at the outset), qed.
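
The content of lemma 3 (the image of a small cube has measure close to |J(a)| times that of the cube) can be observed numerically in the plane. The Python sketch below is an illustration only: the map `phi`, the point a = (0.5, 1) and the use of the shoelace formula on the image of the boundary are choices made for the example; the shoelace area equals the measure of the image because this particular `phi` is injective on the squares considered.

```python
import math

def phi(x, y):
    # hypothetical C^1 map, a diffeomorphism near (0.5, 1)
    return (x + 0.3 * y * y, y + 0.3 * math.sin(x))

def jacobian(x, y):
    # determinant of [[1, 0.6 y], [0.3 cos x, 1]]
    return 1.0 - 0.18 * y * math.cos(x)

def image_area(a, r, n=400):
    """Area enclosed by the image under phi of the boundary of the cube K(a, r)."""
    ax, ay = a
    corners = [(ax - r, ay - r), (ax + r, ay - r), (ax + r, ay + r), (ax - r, ay + r)]
    pts = []
    for k in range(4):
        (x0, y0), (x1, y1) = corners[k], corners[(k + 1) % 4]
        for i in range(n):
            s = i / n
            pts.append(phi(x0 + s * (x1 - x0), y0 + s * (y1 - y0)))
    area = 0.0
    for i in range(len(pts)):
        (x0, y0), (x1, y1) = pts[i], pts[(i + 1) % len(pts)]
        area += x0 * y1 - x1 * y0          # shoelace formula
    return abs(area) / 2.0

def ratio(a, r):
    """m[phi(K)] divided by |J(a)| m(K) for the cube K = K(a, r)."""
    return image_area(a, r) / (abs(jacobian(*a)) * (2 * r) ** 2)

a = (0.5, 1.0)
ratios = [ratio(a, r) for r in (0.4, 0.1, 0.01)]
```

The ratios approach 1 as r decreases, which is what inequality (10.21) expresses.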


Lemma 4. Let U be an open subset and $\varphi$ a $C^1$ map defined on U. Then
$\varphi(M)$ has measure zero for any compact set⁴³ of measure zero $M \subset U$.


Let d > 0 be the distance from the compact subspace M to the border
of U and M’ the set of $x \in U$ such that $d(x,M) \le d/2$. It is also a compact
set contained in U. Let k be an integer > 0 such that $1/2^k < d/2$ and let us
take a grid of $\mathbb{R}^n$ by hyperplanes defined by a single equation $x^i = p/2^k$, with
$p \in \mathbb{Z}$ and $i \in \{1,\dots,n\}$. $\mathbb{R}^n$ can thereby be decomposed into cubes of the
form $K(a, 1/2^{k+1})$ whose pairwise intersections are at most faces of dimension
$\le n - 1$. Finitely many of these cubes intersect M non-trivially since M is
bounded; they are all contained in M’ since their diameter $d_k$ is < d/2; and
finally, they cover M. Let $M_k$ be their union. Then $\varphi(M) \subset \varphi(M_k) = \bigcup \varphi(K)$,
where K varies in the set of cubes $K(a, 1/2^{k+1})$ comprising $M_k$.

To find an upper bound for the measures of these $\varphi(K)$, observe that,
these cubes being convex, by the (FT),

$\varphi(x) - \varphi(y) = \int_0^1 \varphi'[tx + (1 - t)y]\,(x - y)\,dt$

and so

$|\varphi(x) - \varphi(y)| \le |x - y| \cdot \|\varphi'\|_K \le |x - y| \cdot \|\varphi'\|_{M'}$

for all $x, y \in K$. In conclusion, since all cubes K considered have the same
diameter $d_k$, $\varphi(K)$ is contained in a cube of diameter $\le cd_k$, where
$c = \|\varphi'\|_{M'}$, a uniform norm on M’. The measure of a cube of diameter d being
proportional to $d^n$, it follows that

$m[\varphi(K)] \le c^n m(K)$.


However, $m(M_k) = \sum m(K)$, where summation is over the cubes comprising
$M_k$, because their pairwise intersections have measure zero since they are
contained in the hyperplanes of $\mathbb{R}^n$. Also,


43 There is a far stronger result: if U is an open subset of $\mathbb{R}^n$, any $C^1$ map
$\varphi : U \to \mathbb{R}^n$ transforms sets of measure zero contained in U into sets of measure
zero, without any compactness assumption. See Dieudonné, Eléments d’analyse,
XVI.22, exercises 1 and 2. More complete results can be found in Rudin, Real and
Complex Analysis, chap. 8.
(10.22) $m[\varphi(M)] \le m[\varphi(M_k)] \le \sum m[\varphi(K)] \le \sum c^n m(K) = c^n m(M_k)$.


The Lemma will, therefore, follow once we have shown that $\lim m(M_k) = 0$.

But let us compare the sets $M_k$ and $M_{k+1}$. To obtain the first one, take a
grid of $\mathbb{R}^n$ by the hyperplanes $x^i = p/2^k$, whereas the second one is obtained
by using the hyperplanes $x^i = q/2^{k+1}$. Clearly, any cube K of the second grid
is contained in at least one cube K’ of the first; if K meets M, the same holds
for K’. Hence $M_{k+1} \subset M_k$, which gives a decreasing sequence of compact sets
contained in M’. By definition, any point of $M_k$ belongs to a cube of radius
$1/2^{k+1}$ meeting M, and so is at a distance $\le 1/2^k$ from M. Any point common
to all the $M_k$ is, therefore, at distance zero from M, in other words belongs
to M since M is closed.

However, when there is a decreasing sequence of closed sets⁴⁴ $M_k$ with
intersection M, we know that $m(M) = \lim m(M_k)$; this was shown in Chapter
V, §2, end of n° 11, for an increasing sequence of open sets, but, as mentioned
then, the result and the proof remain the same for a decreasing sequence of
closed sets. Since here m(M) = 0, relation (22) shows that $m[\varphi(M)] = 0$,
qed.
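
The dyadic covers $M_k$ used in this proof can be computed explicitly for a simple compact set of measure zero. The Python sketch below is an illustration only: the choice of M as the unit circle in the plane and the function name `cover_measure` are assumptions of the example. A closed square meets the circle exactly when 1 lies between the smallest and the largest distance from the origin to the square.

```python
def cover_measure(k):
    """Total area of the closed dyadic squares of side 2**-k meeting the unit circle."""
    side = 2.0 ** -k
    n = 2 ** k + 1                  # indices -n .. n-1 cover [-1 - side, 1 + side]
    count = 0
    for i in range(-n, n):
        x0, x1 = i * side, (i + 1) * side
        near_x = 0.0 if x0 <= 0.0 <= x1 else min(abs(x0), abs(x1))
        far_x = max(abs(x0), abs(x1))
        for j in range(-n, n):
            y0, y1 = j * side, (j + 1) * side
            near_y = 0.0 if y0 <= 0.0 <= y1 else min(abs(y0), abs(y1))
            far_y = max(abs(y0), abs(y1))
            # squared min and max distance from the origin to the square
            if near_x ** 2 + near_y ** 2 <= 1.0 <= far_x ** 2 + far_y ** 2:
                count += 1
    return count * side * side

measures = [cover_measure(k) for k in range(2, 8)]
```

The list `measures` is non-increasing, since the covers are nested, and its terms tend to 0 = m(M) as k grows.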


(iii) Change of variable formula. The general formula

$\int_B f(x)\,dx = \int_A f[\varphi(t)]\,|J_\varphi(t)|\,dt$

can now be proved by replacing $A = \bar{U}$ with simpler sets, unions of cubes,
and then passing to the limit.

Choose a number r > 0. For any integer k > 0, let us once again take
a grid of $\mathbb{R}^n$ by the hyperplanes $x^i = p/2^k$. Denote the finitely many cubes
in this grid contained in U by $K_1,\dots,K_N$ and let $A_k \subset U$ be their union.
The pairwise intersections of these cubes being compact and having measure
zero, the same holds (lemma 4) for their images; hence

(10.23) $\int_{A_k} f[\varphi(t)] \cdot |J_\varphi(t)|\,dt = \sum_i \int_{K_i} f[\varphi(t)] \cdot |J_\varphi(t)|\,dm(t)$.


The function integrated in (23) being uniformly continuous on the compact
set A, for any r > 0, k may be assumed to be sufficiently large for it to be
constant, up to r, in each cube $K_i$. Setting $a_i$ to be the centre of $K_i$ and
$b_i = \varphi(a_i)$, the general term of the right hand side of (23) is seen to be equal
to $f(b_i) \cdot |J_\varphi(a_i)|\,m(K_i)$, up to $m(K_i)r$.

Now, setting $D_i = \varphi(K_i)$, $B_k = \varphi(A_k) = \bigcup D_i \subset B = \varphi(A)$, lemma
4 shows that the pairwise intersections of the $\varphi(K_i)$ have measure zero. So
once again,

(10.23’) $\int_{B_k} f(x)\,dx = \sum_i \int_{D_i} f(x)\,dx$.


44 More generally, of “measurable” sets, as Lebesgue’s complete theory will show.

As in (23), the uniform continuity of f shows that, for k sufficiently large,
the general term of the right hand side of (23’) is equal to $f(b_i)\,m(D_i)$ up
to $m(D_i)r$. Since $m(A_k) = \sum m(K_i)$ and $m(B_k) = \sum m(D_i)$, the left hand
sides of (23) and (23’) are respectively equal to

(10.24) $\sum f(b_i) \cdot |J_\varphi(a_i)|\,m(K_i)$ up to $m(A_k)r$,

(10.24’) $\sum f(b_i)\,m(D_i)$ up to $m(B_k)r$.


But lemma 3 applies to the $K_i$ (they are contained in U) provided k is
sufficiently large. Then

(10.25) $m(D_i) = |J_\varphi(a_i)|\,m(K_i)$ up to $m(K_i)r$,

so that replacing sum (24) by sum (24’) the error made is bounded above by

$\sum |f(b_i)|\,m(K_i)\,r \le \|f\| \sum m(K_i)\,r = \|f\|\,m(A_k)\,r$,

where ||f|| denotes the uniform norm of f.

In view of the errors made while replacing the left hand sides of (23) and
(23’) by the “Riemann” sums (24) and (24’), the absolute value of the
difference between these left hand sides is bounded above by

$m(A_k)r + m(B_k)r + \|f\|\,m(A_k)r$.


But clearly, $m(A_k) \le m(A)$ and $m(B_k) \le m(B)$; the error found is, therefore,
$\le cr$, where $c = (1 + \|f\|)m(A) + m(B)$ does not depend on r.
This argument shows that, for all r > 0, the inequality

(10.26) $\left| \int_{A_k} f[\varphi(t)]\,|J_\varphi(t)|\,dt - \int_{B_k} f(x)\,dx \right| \le cr$


holds for all sufficiently large k. Next consider what happens when k is
replaced by k + 1. Each cube from the first grid is the union of cubes from
the second; if a cube from the first one is contained in U, those from the
second one comprising it are necessarily so. Hence $A_k \subset A_{k+1}$, and so this
time we get an increasing sequence of compact (and so “measurable”) sets
contained in U. Their union is equal to U since any $a \in U$ is at a distance
> 0 from the border of U and so is contained in one of the cubes of $A_k$ for
sufficiently large k.

If our familiarity with the theory of integration goes slightly further than
Chap. V, §2, n° 11, we can conclude that the extended integrals over $A_k$ and
$B_k$ in (26) converge to the extended integrals over U and $V = \varphi(U)$ as $k \to +\infty$.
r > 0 being arbitrary, it follows that

$\int_U f[\varphi(t)]\,|J_\varphi(t)|\,dt = \int_V f(x)\,dx$.
To finish the proof of theorem 4, it remains to observe that if the borders
A − U and B − V have measure zero,⁴⁵ the previous integrals remain invariant
if U and V are replaced by A and B. This indicates that the essential result
is in fact the previous formula, which does not assume anything about the
borders of U and V.
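
As a sanity check of the formula just proved, one can compare both sides numerically for polar coordinates. The Python sketch below is an illustration only: the diffeomorphism φ(r,θ) = (r cos θ, r sin θ) from A = [1,2] × [0,π/2] onto a quarter annulus B, the function f(x,y) = xy and the two midpoint rules are choices made for the example. Here $|J_\varphi| = r$ and the common value is 15/8.

```python
import math

def f(x, y):
    return x * y

def rhs(n=400):
    """Integral over A = [1,2] x [0,pi/2] of f[phi(t)] |J_phi(t)|, with |J_phi| = r."""
    dr, dth = 1.0 / n, (math.pi / 2) / n
    total = 0.0
    for i in range(n):
        r = 1.0 + (i + 0.5) * dr
        for j in range(n):
            th = (j + 0.5) * dth
            total += f(r * math.cos(th), r * math.sin(th)) * abs(r)
    return total * dr * dth

def lhs(n=600):
    """Integral of f over B = phi(A), computed directly in Cartesian coordinates."""
    h = 2.0 / n
    total = 0.0
    for i in range(n):
        x = (i + 0.5) * h
        for j in range(n):
            y = (j + 0.5) * h
            if 1.0 <= x * x + y * y <= 4.0:    # (x, y) lies in the quarter annulus
                total += f(x, y)
    return total * h * h

left, right = lhs(), rhs()
```

The Cartesian sum is less accurate because the grid cuts the curved part of the border of B; that border has measure zero, which is why the discrepancy shrinks as the grid is refined.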


(iv) Stokes’ formula for a p-dimensional path. The definition of the
extended integral of a form of degree 2 over a 2-dimensional path $\sigma$ can be
generalized in an obvious manner: if $\omega$ is a form of degree p on an open
subset G of a Cartesian space E and if $\sigma : I^p \to G$ is a p-dimensional
path (or singular cube) assumed to be of class at least $C^1$ so that the partial
derivatives of $\sigma$ extend by continuity to the border of $I^p$, then, by definition,
set

(10.27) $\int_\sigma \omega = \int_{I^p} \omega[\sigma(t); D_1\sigma(t),\dots,D_p\sigma(t)]\,dt$,

where $t = (t^1,\dots,t^p)$ and where $dt = dt^1 \dots dt^p$ is the usual Lebesgue
measure. This obviously amounts to setting

$\omega \circ \sigma = r(t)\,dt^1 \wedge \dots \wedge dt^p$ and $\int_\sigma \omega = \int_{I^p} r(t)\,dt$

as in degree 2. If $\sigma$ is replaced by $\sigma \circ \varphi$, where $\varphi : I^p \to I^p$ is a
diffeomorphism, then, by the associativity formula (8.19), $\omega \circ \sigma = \varpi$ is replaced
by

$\omega \circ (\sigma \circ \varphi) = (\omega \circ \sigma) \circ \varphi = \varpi \circ \varphi$.

So r(t) is replaced by $r[\varphi(t)]\,J_\varphi(t)$. Formula (2), seen to be equivalent to
Theorem 4, then shows that

(10.28) $\int_{\sigma \circ \varphi} \omega = \mathrm{sgn}(\varphi) \int_\sigma \omega$,

where $\mathrm{sgn}(\varphi)$ is the, necessarily constant, sign of the Jacobian of $\varphi$.

This formula resembles the definition of the integral of a form $\omega$ along a
path $\sigma$ as the integral over the cube of its inverse image under $\sigma$, but
they should not be confused: (28) is a theorem and not a definition. More
generally, consider the open subsets U and V of two Cartesian spaces, a $C^1$
map $f : U \to V$ and a p-dimensional path $\sigma$ in U. This gives a path $f \circ \sigma$
in V. Then

(10.29) $\int_{f \circ \sigma} \omega = \int_\sigma \omega \circ f$

for any differential form $\omega$ of degree p on V, which is even more like (28). But
(29) means that the integrals over the cube of $\omega \circ (f \circ \sigma)$ and $(\omega \circ f) \circ \sigma$ are equal;
these two forms being identical by (8.19), there is nothing to show. Relation
(29) is, therefore, almost a tautology or, equivalently, a direct consequence of
the multivariate chain rule formula (2.13).


45 By lemma 4, this is always the case when $\varphi$ can be extended to a $C^1$ function
defined on an open set containing A.


To take an example, let us return to the computations of n° 9, (ii) related
to the effect of a homotopy $\sigma$ on the integral of a form $\omega$ of degree 1 along a
path; writing the final formula as


(10.30)  $$\int_{\partial\sigma} \omega = \int_\sigma d\omega,$$


it reduces to Gauss’ formula for the square $I^2$.

An analogue of the Green-Riemann formula for the cube exists in arbitrary
dimension; using Cauchy’s simple calculation, the proof is the same as
in dimension 2. The only difficulty concerns the definition of the extended
integral of a form $\omega$ of degree $p$ over $\partial K$ for $K = I^{p+1}$; it is the sum of
extended integrals over the faces of the cube, but the signs $+$ and $-$ placed
before these integrals need to be determined. However, denoting the canonical
coordinates of $\mathbb{R}^{p+1}$ by $t^0, \ldots, t^p$ gives formulas of type


$$\omega = \sum_i p_i(t)\, dt^0 \wedge \ldots \wedge \widehat{dt^i} \wedge \ldots \wedge dt^p,$$
$$d\omega = \sum_i (-1)^i D_i p_i(t)\, dt^0 \wedge \ldots \wedge dt^p$$

(a hat marks an omitted factor), so that the integral of $d\omega$ is the sum of the
integrals over $K$ of the functions $(-1)^i D_i p_i(t)$. To calculate them, first
integrate with respect to the corresponding $t^i$. This gives (FT)


$$(-1)^i \int_{I^p} p_i(t^0, \ldots, 1, \ldots, t^p)\, dt^0 \ldots \widehat{dt^i} \ldots dt^p
+ (-1)^{i+1} \int_{I^p} p_i(t^0, \ldots, 0, \ldots, t^p)\, dt^0 \ldots \widehat{dt^i} \ldots dt^p,$$

involving the extended integrals of $\omega$ over the faces of the cube $I^{p+1}$. More
precisely, write $F_i^+$ for the face $t^i = 1$ of the cube and $F_i^-$ for the face $t^i = 0$,
and define the extended integrals of $\omega$ over these faces by using the parametric
representation


(10.31)  $$(t^0, \ldots, \widehat{t^i}, \ldots, t^p) \mapsto (t^0, \ldots, t^{i-1}, 1, t^{i+1}, \ldots, t^p)$$


in the first case and the analogous formula in the second. If $F$ is a face
of the cube, set


(10.32)  $$\varepsilon(F) = \begin{cases} (-1)^i & \text{if } F = F_i^+, \\ (-1)^{i+1} & \text{if } F = F_i^-. \end{cases}$$

The Green-Riemann formula can then be written


(10.33)  $$\int_K d\omega = \sum_F \varepsilon(F) \int_F \omega,$$


where, as mentioned above, the extended integrals of $\omega$ over the faces $F$ of
$K = I^{p+1}$ are defined by using the parametric representations (31), which
amounts to regarding these faces as $p$-dimensional singular cubes.
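
For the square (p = 1) the formula can be tested on an explicit form. The following Python sketch is only illustrative: the coefficients p0 and p1, the midpoint rule and every name are chosen here, and the signs are taken to be (−1)^i on the face t^i = 1 and (−1)^(i+1) on the face t^i = 0.

```python
def midpoint_1d(g, n=200):
    """Midpoint-rule integral of g over [0, 1]."""
    return sum(g((k + 0.5) / n) for k in range(n)) / n

def midpoint_2d(g, n=200):
    """Midpoint-rule integral of g over the square [0, 1] x [0, 1]."""
    return sum(g((i + 0.5) / n, (j + 0.5) / n)
               for i in range(n) for j in range(n)) / (n * n)

# omega = p0 dt^1 + p1 dt^0 in the coordinates (t0, t1) of the square
# (each coefficient p_i multiplies the differential with dt^i omitted).
def p0(t0, t1):
    return t0 * t0 + t1

def p1(t0, t1):
    return t0 * t1

# Coefficient of dt^0 ^ dt^1 in d(omega): D_0 p0 - D_1 p1 = 2 t0 - t0.
def d_omega(t0, t1):
    return 2.0 * t0 - t0

# Each face carries its sign eps(F) and the restriction of the relevant p_i.
faces = [
    (+1, lambda s: p0(1.0, s)),   # face t0 = 1, sign (-1)^0
    (-1, lambda s: p0(0.0, s)),   # face t0 = 0, sign (-1)^1
    (-1, lambda s: p1(s, 1.0)),   # face t1 = 1, sign (-1)^1
    (+1, lambda s: p1(s, 0.0)),   # face t1 = 0, sign (-1)^2
]

integral_of_d_omega = midpoint_2d(d_omega)
signed_face_sum = sum(eps * midpoint_1d(g) for eps, g in faces)
print(integral_of_d_omega, signed_face_sum)
```

With these signs the four faces are traversed counterclockwise, which is the usual orientation of the border of the square.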

To go from here to a formula of Stokes type for a form $\omega$ of degree $p$ and
an arbitrary path $\sigma$ of dimension $p+1$ in an open subset $G$ of a Cartesian space,
we need to define what will be meant by the extended integral of $\omega$ over $\partial\sigma$;
denoting the parametric representation (31) of the face $F$ by $\varphi_F$, it will be
the expression


$$\int_{\partial\sigma} \omega = \sum_F \varepsilon(F) \int_{\varphi_F} \omega \circ \sigma.$$


Using formula (33) for the cube, Stokes’ formula for $\sigma$,


$$\int_{\partial\sigma} \omega = \int_\sigma d\omega,$$

then becomes trivial.


The presence of the signs $\varepsilon(F)$ will become clear in n° 16 where a slightly
different version of Stokes’ formula will be proved; as in dimension 1 or 2,
it corresponds to the necessity of choosing an “orientation” for each face $F$
of $K$.

Exercise. Let $\sigma_0, \sigma_1 : I^p \to G$ be two $p$-dimensional paths in an open
subset $G$ of a Cartesian space, coinciding on the border of $I^p$ (the analogue
of two one-dimensional paths with the same endpoints). They will be said to
be fixed-border-homotopic if there is a path $\sigma : I \times I^p \to G$ satisfying the
following conditions: (i) $\sigma(0,t) = \sigma_0(t)$, $\sigma(1,t) = \sigma_1(t)$ for all $t \in I^p$, (ii) for
every point $t$ in the border of $I^p$, $\sigma(s,t)$ is independent of $s$. Show that, if $\omega$
is a closed form of degree $p$ on $G$, then the extended integrals of $\omega$ over $\sigma_0$
and $\sigma_1$ are equal. Similarly, generalize the invariance under homotopy of the
integral over a closed path.

§ 4. Differential Manifolds


This § gives a very summary treatment of the simplest aspects of the theory
of differential manifolds. Even at the level of its most basic notions, the theory
has several other aspects. There are many excellent presentations of the subject,
far more complete than ours.46


11 — What is a Manifold ? 


(i) The sphere in $\mathbb{R}^3$. To understand this problem, consider the unit sphere
$X$ having equation $x^2 + y^2 + z^2 = 1$ in $\mathbb{R}^3$. What would be a reasonable way
to define differentiable functions on an open subset of $X$?

A first condition they need to satisfy is to be continuous and defined by
properties of a local nature. So consider a point $(a,b,c) \in X$ and a function
$f$ defined and continuous in the neighbourhood of this point. Suppose
that $c > 0$. The upper hemisphere $H_+ : z > 0$ of $X$ is an open subset of $X$
and also the graph of a $C^\infty$ function


$$z = (1 - x^2 - y^2)^{1/2}$$
defined on the open subset $x^2 + y^2 < 1$ of $\mathbb{R}^2$. Setting


$$\varphi(x, y, z) = (x, y),$$


we define a homeomorphism from $H_+$ onto an open subset of $\mathbb{R}^2$ transforming
$f$ into a function of $(x,y)$, defined and continuous in the neighbourhood of
the point $\varphi(a,b,c) \in \mathbb{R}^2$. It is then natural to say that $f$ is of class $C^r$ in the
neighbourhood of $(a,b,c)$ if, as a function of $(x,y)$, it is of class $C^r$ in the
classical sense. We adopt the same convention if $c < 0$, i.e. if we are in the
lower hemisphere $H_-$ of $X$, the graph of the function
$$z = -(1 - x^2 - y^2)^{1/2}.$$
If $c = 0$, perhaps $b < 0$; then replace $H_+$ by the hemisphere $y < 0$, which is
the graph of the equation
$$y = -(1 - x^2 - z^2)^{1/2}.$$
The formula $\varphi(x, y, z) = (x, z)$ again defines a homeomorphism from this
hemisphere onto an open subset of the plane, and the differentiable functions in the
neighbourhood of $(a,b,c)$ are, by definition, those that can be expressed in a
differentiable manner using the coordinates $x, z$. Etc. Thus, $X$ can be written
as the union of six open subsets. On each of them there is a special homeomorphism
onto an open subset of $\mathbb{R}^2$. Each of these homeomorphisms makes
it possible to give a reasonable definition of $C^r$ ($r \le \infty$) functions on the
corresponding open subset of $X$.


46 Marcel Berger and Bernard Gostiaux, Géométrie différentielle (A. Colin, 1972),
Paul Malliavin, Géométrie différentielle intrinsèque (Hermann, 1972), Pham
Mau Quan, Introduction à la géométrie des variétés différentiables (Dunod,
1969), Frank W. Warner, Foundations of Differential Manifolds and Lie Groups
(Scott, Foresman, 1971), Michael Spivak, A Comprehensive Introduction to
Differential Geometry (Publish or Perish, Inc, 5 vol.), Shlomo Sternberg, Lectures
on Differential Geometry (Prentice-Hall, 1964), Serge Lang, Fundamentals of
Differential Geometry (Springer, 1999).

This definition would, however, be meaningless if it gave incompatible
definitions of differentiability on the intersections of these open subsets;
this is not at all the case. For example, take an open set $U$ on which both
$z > 0$ and $y < 0$ hold; using the hemisphere $z > 0$, the relation $f \in C^r(U)$ means
that $f$ is a $C^r$ function of $(x,y)$; using the hemisphere $y < 0$, it means that
$f$ is a $C^r$ function of $(x,z)$. Hence it suffices to show that $(x,y)$ is a $C^\infty$
function of $(x,z)$ on the open subset $\{z > 0\} \cap \{y < 0\}$ of $X$, and conversely.
This is obvious since


$$y = -(1 - x^2 - z^2)^{1/2} \quad\text{and}\quad z = (1 - x^2 - y^2)^{1/2}$$


with $1 - x^2 - z^2 > 0$ and $1 - x^2 - y^2 > 0$.

This leads to a coherent definition of functions of class $C^r$ on an arbitrary
open subset of the sphere.
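
The compatibility of the two hemisphere charts can be made concrete. In the Python sketch below, the function names and the sample point are chosen here for illustration only; the two changes of chart are the square-root formulas above, the sign of y being fixed by the condition y < 0.

```python
import math

def chart_top(p):
    """Chart on the hemisphere z > 0: keep (x, y)."""
    x, y, z = p
    return (x, y)

def chart_front(p):
    """Chart on the hemisphere y < 0: keep (x, z)."""
    x, y, z = p
    return (x, z)

def top_to_front(xy):
    """Change of chart (x, y) -> (x, z), valid where z > 0."""
    x, y = xy
    return (x, math.sqrt(1.0 - x * x - y * y))

def front_to_top(xz):
    """Change of chart (x, z) -> (x, y), valid where y < 0 (hence the minus sign)."""
    x, z = xz
    return (x, -math.sqrt(1.0 - x * x - z * z))

# A point of the unit sphere lying in both hemispheres (z > 0 and y < 0).
point = (0.36, -0.48, 0.8)

print(top_to_front(chart_top(point)), chart_front(point))
print(front_to_top(chart_front(point)), chart_top(point))
```

Both changes of chart are compositions of polynomials and a square root of a strictly positive quantity, which is why they are of class C^∞ on the overlap.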

It can also be formulated more directly. First, given a $C^r$ function of
$(x,y,z)$ on an open subset $V$ of $\mathbb{R}^3$, its restriction to $U = V \cap X$ is of class $C^r$
in the previous sense. Conversely, consider a $C^r$ function $f$ on an open subset
$U$ of $X$ and take a neighbourhood of a point $(a,b,c) \in U$. If, for example,
$c > 0$, the definition shows that in the neighbourhood of $(a,b,c)$, $f$ is a $C^r$
function of $(x,y)$ defined in the neighbourhood of the point $(a,b)$ of $\mathbb{R}^2$. The
composition of $f$ and the projection $(x,y,z) \mapsto (x,y)$ of $\mathbb{R}^3$ onto $\mathbb{R}^2$ is a $C^r$
function of $(x,y,z)$ on a vertical cylinder having an open subset of the plane
$(x,y)$ as base. The restriction of this function to $X$ is equal to the given
function $f$ on a neighbourhood of $(a,b,c)$. In conclusion, a function $f$ defined
on an open subset $U$ of $X$ is of class $C^r$ if and only if, in the neighbourhood
of every point of $U$, it has a $C^r$ extension to an open subset of $\mathbb{R}^3$.


(ii) The notion of a manifold of class $C^r$ and dimension $d$ is obtained by
generalizing the construction of $C^r$ functions on the sphere.

To start with, a manifold $X$ is a separated topological space. Hence there
is a category of sets in $X$ called open and satisfying the two obvious conditions
(any union of open sets is open, the intersection of a finite number of open
sets is open), as well as Hausdorff’s condition: if $a$ and $b$ are two distinct
points of $X$, there are disjoint open subsets $U$ and $V$ containing $a$ and $b$ respectively.
Topological spaces are the natural realm of the notion of continuity: a map
$f : X \to Y$ is continuous if and only if the inverse image $f^{-1}(V)$ of any
open subset $V$ of $Y$ is an open subset of $X$.

A differential manifold $X$ must also be a locally Cartesian topological
space: for each $a \in X$, there must be an open neighbourhood $U$ homeomorphic
to an open subset of some space $\mathbb{R}^d$, where $d$ is a given integer,47 the
dimension of $X$. Such a homeomorphism $\varphi = (\varphi^1, \ldots, \varphi^d)$, where the $\varphi^i(x)$
are the canonical coordinates of $\varphi(x)$, is, by definition, a local topological
chart $(U, \varphi)$ of $X$, which makes it possible to identify a point $x \in U$ by using $d$
real scalars $\xi^i = \varphi^i(x)$, its coordinates in the chart considered.48 The integer
$d$ is uniquely determined thanks to a well-known theorem (L. E. J. Brouwer)
according to which an open subset of $\mathbb{R}^p$ can be homeomorphic to an open
subset of $\mathbb{R}^q$ only if $p = q$. Peano’s curve ($p = 1$, $q = 2$) is not a
homeomorphism.


Fig. 11.5. 


This definition supplies the topological or $C^0$ manifolds: only continuous
functions can be reasonably defined on them. To turn $X$ into a manifold of
class $C^r$, the charts admitted need to be selected. As in the case of the
terrestrial sphere, a possible method is to take an atlas of class $C^r$ of $X$,
i.e. a finite or infinite family of topological charts $(U_p, \varphi_p)$ covering $X$ and
pairwise $C^r$-compatible: for all $p$ and $q$, there is a $C^r$ map $\theta_{pq}$ (in the usual
sense) from the open set $\varphi_p(U_{pq})$ to the open set $\varphi_q(U_{pq})$ taking $\varphi_p(x)$ to
$\varphi_q(x)$ in $U_p \cap U_q = U_{pq}$:


(11.1)  $$\varphi_q(x) = \theta_{pq}[\varphi_p(x)].$$


Exchanging the roles of $p$ and $q$, it follows that $\theta_{pq}$ and $\theta_{qp}$ are mutually inverse
and hence are diffeomorphisms.

For $s \le r$, a function $f$ defined on an open subset $U$ of $X$ will then be
said to be of class $C^s$ if, for all $p$, $f(x)$ is a function of class $C^s$ of $\varphi_p(x)$
on the open set $U \cap U_p$.


47 Some authors allow the dimension to depend on the point a. As it is locally 
constant, and so constant in each connected component of X, this generalization 
is of little interest. 

48 Hence, like any open subset of a Cartesian space, a manifold is locally com- 
pact. Apart from some rare exceptions, all manifolds encountered are unions of 
countably many compact and metrizable sets.

Manifolds have been defined using atlases, but the only thing that really
matters in a manifold $X$ is the set $C^r(U)$, associated to every open subset
$U \subset X$, of numerical functions defined and of class $C^r$ on $U$. As in the
case of the sphere, these functions are characterized by a local property: if
a function $f$ is defined on a union $U$ of open subsets $U_i$, then $f \in C^r(U)$
if and only if the restriction of $f$ to each $U_i$ belongs to $C^r(U_i)$. Apart from
those of its atlas, there are many other useful local charts in a manifold $X$ defined in
this manner, namely the charts of class $C^r$ that, for short, will henceforth
almost always be called charts, or local charts, without further precision when
there is no ambiguity. These are the topological charts $(U, \varphi)$ such that, for
any open subset $U' \subset U$, the homeomorphism $\varphi$ transforms $C^r(U')$ into the
set of $C^r$ functions (in the usual sense) on the open subset $\varphi(U')$ of $\mathbb{R}^d$. In
other words, a function defined on $U'$ is of class $C^r$ if and only if it is a $C^r$
function in the classical sense of the coordinates $\xi^i = \varphi^i(x)$. In particular, the
coordinate functions $\varphi^i(x)$ must be of class $C^r$ on $U$, but choosing
$d$ functions in $C^r(U)$ at random is obviously not sufficient to obtain such a chart.
Loosely speaking, an open subset $U$ of $X$ will be said to be open Cartesian if it is the
domain of a chart $(U, \varphi)$.

If $(U, \varphi)$ and $(V, \psi)$ are charts of class $C^r$ of $X$, the $C^r$ functions on $U \cap V$
must be the same, whether they be expressed in terms of $\xi = \varphi(x)$ or of $\eta = \psi(x)$. As in the
above case of an atlas, this leads to the conclusion that relations
$\eta = \theta(\xi)$, $\xi = \rho(\eta)$ must hold in $U \cap V$, i.e.


(11.2)  $$\varphi = \rho \circ \psi, \qquad \psi = \theta \circ \varphi,$$


where $\theta : \varphi(U \cap V) \to \psi(U \cap V)$ and $\rho : \psi(U \cap V) \to \varphi(U \cap V)$ are of class
$C^r$, in other words are mutually inverse diffeomorphisms. The (formulas of)
change of chart (or coordinate) will be denoted $\rho$ and $\theta$. I do so because this
was the notation used by the inventors of absolute differential calculus (n° 3,
(ii)), who without knowing it were working in the theory of manifolds.


(iii) Some Examples. Take $E$ to be a $d$-dimensional Cartesian space and
choose a basis $(a_i)$ for $E$. The point $\varphi(x) = \xi^i e_i \in \mathbb{R}^d$ can be associated to
each $x = \xi^i a_i \in E$. Its canonical coordinates are those of $x$ with respect to
the basis of $E$. This gives a topological chart $(E, \varphi)$ for $E$ which, on its own,
is an atlas of $E$ and so turns $E$ into a manifold of class $C^\infty$. The $\xi^i$ undergo
a linear transformation under a change of basis in $E$. Clearly, the manifold
structure thus defined on $E$ does not depend on the choice of the basis $(a_i)$
and for any open subset $U \subset E$, $C^r(U)$ is for all $r$ the set of $C^r$ functions on
$U$ in the classical sense.

In the case of the unit sphere $X$ discussed above, there are, among others,
two charts, a priori of class $C^0$, using the stereographic projection from the
north or the south pole of $X$; they map the open complement $U$ of this pole in
$X$ to $\mathbb{R}^2$. If we consider the projection from the north pole, then the function
$\varphi$ is easily seen to be given by


$$\varphi(x, y, z) = \left[x/(1 - z),\; y/(1 - z)\right]$$
with $x^2 + y^2 + z^2 = 1$. But considering any of the six local charts defined
above, the corresponding coordinates are two of the variables $x, y, z$, the third
one being, as we have seen, a $C^\infty$ function of the other two. The chart $\varphi$ is,
therefore, $C^\infty$-compatible with the atlas initially used since $z \ne 1$ outside
the north pole, and conversely. Hence the differentiable structure of $S^2$ could
have been defined by using these two stereographic projections.
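
A small Python sketch may help to see the two stereographic charts at work. Only the formula for the projection from the north pole comes from the text; the inverse map, the projection from the south pole and the change of chart ξ ↦ ξ/|ξ|² are the standard formulas, stated here as assumptions, and all names are illustrative.

```python
def stereo_north(p):
    """Projection from the north pole (0, 0, 1); defined for z != 1."""
    x, y, z = p
    return (x / (1.0 - z), y / (1.0 - z))

def stereo_north_inverse(uv):
    """Point of the unit sphere whose north-pole chart coordinates are (u, v)."""
    u, v = uv
    s = u * u + v * v
    return (2.0 * u / (s + 1.0), 2.0 * v / (s + 1.0), (s - 1.0) / (s + 1.0))

def stereo_south(p):
    """Projection from the south pole (0, 0, -1); defined for z != -1."""
    x, y, z = p
    return (x / (1.0 + z), y / (1.0 + z))

def north_to_south(uv):
    """Change of chart on the sphere minus both poles, i.e. for (u, v) != (0, 0)."""
    u, v = uv
    s = u * u + v * v
    return (u / s, v / s)

point = (0.36, -0.48, 0.8)          # a point of the unit sphere, not a pole
uv = stereo_north(point)
print(uv, stereo_north_inverse(uv))
print(north_to_south(uv), stereo_south(point))
```

The change of chart is a rational function with non-vanishing denominator on the overlap, so the two charts are C^∞-compatible with each other.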

To find an example of an “abstract” manifold, i.e. one which is not given as
a subset of a Cartesian space, consider the projective space $X = P_n(\mathbb{R})$; by
definition, it is the set of 1-dimensional vector subspaces (lines through
the origin) of $\mathbb{R}^{n+1}$. It would amount to the same to establish the
equivalence relation “$x$ and $y$ are proportional” on the set of non-zero vectors
of $\mathbb{R}^{n+1}$ and to say that $P_n(\mathbb{R})$ is the quotient space of $\mathbb{R}^{n+1} - \{0\}$ modulo
this relation. An element of $P_n(\mathbb{R})$ is, therefore,
characterized by $n+1$ numbers $x^0, \ldots, x^n$, not all of which are zero, defined
up to a factor; these are its homogeneous coordinates. Any function $f$ defined
on a subset of $P_n(\mathbb{R})$ can be identified with a homogeneous function of these
coordinates, i.e. such that


$$f(tx^0, \ldots, tx^n) = f(x^0, \ldots, x^n) \quad\text{for all } t \ne 0.$$


If $p(x)$ denotes the image of $x \in \mathbb{R}^{n+1} - \{0\}$ in $X = P_n(\mathbb{R})$, i.e. the
subspace $D$ generated by $x$, then a topology can be defined on $X$ by requiring
$U \subset X$ to be open if and only if $p^{-1}(U)$ is open in $\mathbb{R}^{n+1} - \{0\}$ or, equivalently,
if the union of the lines $D \in U$ is an open subset of $\mathbb{R}^{n+1} - \{0\}$. In particular,
$X$ is the union of the $n + 1$ open sets $U_i$ ($0 \le i \le n$) that are the images under $p$
of the open subsets of $\mathbb{R}^{n+1}$ defined by $x^i \ne 0$. As $p(x) = p(x/x^i)$, $U_i = p(E_i)$
also holds, where $E_i$ is the hyperplane with equation $\xi^i = 1$. $U_i$ is, therefore,
the set of lines $D$ having non-empty intersection with $E_i$ and as such a line is
determined by its (unique) intersection point with $E_i$, the map $p : E_i \to U_i$
is bijective. This leads to an inverse map $q_i : U_i \to E_i$ which takes every
line $D \in U_i$ onto its intersection point with $E_i$. For all $D \in U_i$,


$$q_i(D) = (\xi^0, \ldots, \xi^{i-1}, 1, \xi^{i+1}, \ldots, \xi^n)$$


with well-defined $\xi^j = x^j/x^i$, so that the formula


$$\varphi_i(D) = (\xi^0, \ldots, \xi^{i-1}, \xi^{i+1}, \ldots, \xi^n)$$


gives a bijection from $U_i$ onto $\mathbb{R}^n$, which together with its inverse is obviously
continuous. Hence the couples $(U_i, \varphi_i)$ thus defined are charts of $X$ of class
$C^0$ covering $P_n(\mathbb{R})$. In fact, they are pairwise $C^\infty$-compatible. Indeed, for
$D \in U_i \cap U_j$, there are relations of the form


$$q_i(D) = (\xi^0, \ldots, 1, \ldots, \xi^n), \qquad q_j(D) = (\eta^0, \ldots, 1, \ldots, \eta^n);$$


these points of $\mathbb{R}^{n+1}$ being on $D$, $\xi^k = \eta^k/\eta^i$ and $\eta^k = \xi^k/\xi^j$ for all $k$.
As $U_i \cap U_j$ corresponds to the $\eta \in E_j$ such that $\eta^i \ne 0$, the coordinates $\varphi_i(D)$
in $U_i \cap U_j$ are $C^\infty$ (or even rational) functions of the coordinates $\varphi_j(D)$
and conversely. The result follows, and this atlas defines a $C^\infty$ manifold
structure on $P_n(\mathbb{R})$. $C^r$ functions on an open subset $U \subset P_n(\mathbb{R})$ are just
the homogeneous functions of class $C^r$ (in the usual sense) on the open set
$p^{-1}(U)$.
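
The charts of the projective space lend themselves to exact computation with rational numbers. In the Python sketch below, a line D is represented by any non-zero integer vector on it; the function names and this representation are choices made here, not notation from the text.

```python
from fractions import Fraction

def chart(i, x):
    """Chart phi_i of the line through x: the ratios x^j / x^i for j != i."""
    if x[i] == 0:
        raise ValueError("the line does not belong to U_i")
    return tuple(Fraction(x[j], x[i]) for j in range(len(x)) if j != i)

def chart_inverse(i, coords):
    """The point q_i(D) of the hyperplane xi^i = 1 with the given chart coordinates."""
    coords = list(coords)
    return tuple(coords[:i] + [Fraction(1)] + coords[i:])

def change_of_chart(i, j, coords):
    """phi_j o phi_i^{-1}, defined where the j-th homogeneous coordinate is non-zero."""
    return chart(j, chart_inverse(i, coords))

x = (2, 4, -6)                       # homogeneous coordinates of a line in R^3
print(chart(0, x))                   # coordinates in U_0
print(chart(0, (-1, -2, 3)))         # a proportional vector gives the same result
print(change_of_chart(0, 1, chart(0, x)), chart(1, x))
```

Since the change of chart only divides by a coordinate that does not vanish on the overlap, it is a rational map, as stated above.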

Exercise. Check Hausdorff’s axiom. 


Replacing the 1-dimensional subspaces in the previous construction by
$p$-dimensional ones for given $p$, it is possible to generalize. But it is slightly
less simple and we leave it to the reader to give detailed proofs. Let $E$ be
an $n$-dimensional Cartesian space and $\Omega \subset E^p$ the set of sequences
$x = (x_1, \ldots, x_p)$ of $p$ linearly independent vectors in $E$; it is an open subset of
$E^p$. Any $p$-dimensional subspace $H$ of $E$ is generated by the “components”
$x_i$ of some $x \in \Omega$, and $x, y \in \Omega$ generate the same subspace if and only if
there is a matrix $(g_i^j) \in GL_p(\mathbb{R})$ such that $y_i = g_i^j x_j$. This is an equivalence
relation, so that the set $X = G_p(E)$ of $p$-dimensional subspaces of $E$ is the
quotient of $\Omega$ modulo this relation. If $p : \Omega \to X$ denotes the obvious map,
we then get a topology on $X$ as in the case $p = 1$: $U \subset X$ is open if and only
if so is $p^{-1}(U)$ in $\Omega$ (or $E^p$). Having done this, if $U \subset X$ is open, $C^\infty(U)$ is,
by definition, the set of functions $f$ for which $f \circ p$ is $C^\infty$ on the open subset
$p^{-1}(U)$ of $\Omega$. There are verifications to be done and charts to be found; we
proceed as follows.

For this, choose an $(n - p)$-dimensional vector subspace $F$ in $E$ and let $X_F$
denote the set of $H \in X$ such that $H \cap F = \{0\}$ or, what amounts to the
same for reasons of dimension, such that $E = F \oplus H$, a direct sum. It is not
hard to see that $X_F$ is open in $X$. Then choose $p$ vectors $a_i \in E$ generating
a subspace $H_0$ such that $E = F \oplus H_0$. If $H \in X_F$, the relation $E = F \oplus H$
shows that, for all $i$, there is a unique $x_i \in F$ such that $a_i - x_i \in H$. Setting
$\varphi(H) = (x_1, \ldots, x_p)$ for all $H \in X_F$, we define a bijection from $X_F$ onto $F^p$
(a linear algebra exercise) and hence a chart $(X_F, \varphi)$ for $X_F$, a priori purely
set-theoretical.

The couples $(X_F, \varphi)$, which depend on the choice of $F$ and of the vectors $a_i$, are in
fact pairwise $C^\infty$-compatible topological charts and they define the
$p(n-p)$-dimensional manifold structure of the Grassmannian $X = G_p(E)$.

The latter is compact. To see this, choose a Hilbert or Euclidean scalar
product $(x|y)$ on $E$ and note that in every subspace $H$ of $E$, there are bases
$x = (x_1, \ldots, x_p)$ orthonormal with respect to it: $(x_i|x_j) = 1$ or $0$ according
to whether $i = j$ or not. As these relations are preserved when passing to the limit
and prove that the $x_i$ are linearly independent, the set $\Omega_0 \subset \Omega$ of these systems
is closed in $E^p$. It is also compact since it is bounded. As $p : \Omega \to X$ is continuous
and maps $\Omega_0$ onto $X$, the compactness of $X$ is obvious.49

Real or complex (replace $\mathbb{R}$ by $\mathbb{C}$ in what precedes) Grassmannians were
invented in the 19th century in order to generalize projective geometry
and its real or imaginary “points at infinity”. In former times, they used to
delight undergraduates, including the present author during his youth. We
thus learnt that all circles in the plane, irrespective of their centre, go through
the same two points at infinity, the “cyclic points” with homogeneous
coordinates $(1, i, 0)$ and $(1, -i, 0)$. These manifolds play an important role in
algebraic geometry and their topological properties have been extensively
studied.


49 For further information and other methods, see Dieudonné, XVI.11.


(iv) Differentiable maps. In the same way that topological spaces are
adapted to the general notion of a continuous map, manifolds are adapted
to the notion of a differentiable map. It is all based on the following remark:
given two manifolds $X$ and $Y$ and a map $f : X \to Y$ taking the domain
$U$ of a chart $(U, \varphi)$ of $X$ into the domain $V$ of a chart $(V, \psi)$ of $Y$, there is a
unique map $F : \varphi(U) \to \psi(V)$ such that


$$\psi \circ f = F \circ \varphi$$


on $U$; it is the map which, for any $x \in U$, makes it possible to calculate
the coordinates $\eta^j$ of the point $y = f(x)$ in the chart $(V, \psi)$ in terms of the
coordinates $\xi^i$ of $x$ in the chart $(U, \varphi)$. We will sometimes say that $F$ expresses
$f$ in the charts considered. Note that, if $f$ is continuous at a point $a$ of $X$,
then for every chart $(V, \psi)$ of $Y$ at $b = f(a)$, there exists a chart $(U, \varphi)$ of
$X$ at $a$ such that $f(U) \subset V$, since $f^{-1}(V)$ is a neighbourhood of $a$, and so
contains an open set containing $a$, which contains the domain of a chart of
$X$ at $a$. Because of this trivial observation, it is possible to generalize the
definitions and results relating to Cartesian spaces to maps from one manifold
to another, provided we check that they are well-defined independently of
the charts used, a fact that is generally immediate.

For example, given two manifolds $X$ and $Y$ of class at least $C^r$, a map
$f : X \to Y$ will be said to be of class $C^r$ if $f$ is continuous and if, for any
charts $(U, \varphi)$ and $(V, \psi)$ such that $f(U) \subset V$, the function $F$ expressing $f$ in
these charts is of class $C^r$ in the usual sense. This means that, for any open
subset $W$ of $Y$ and any function $g \in C^r(W)$, the composite function $g \circ f$,
defined on the open subset $f^{-1}(W)$, is of class $C^r$ on it. If, moreover, $f$ is a
homeomorphism and if $f^{-1}$ is of class $C^r$, $f$ is said to be a diffeomorphism of
class $C^r$ from $X$ to $Y$. The reader will easily check that the map $X \to Z$ obtained by
composing two maps $X \to Y$ and $Y \to Z$ of class $C^r$ is also of class $C^r$.
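
To see what "the function F expressing f in charts" looks like in practice, here is a Python sketch for the antipodal map of the unit sphere, with the stereographic chart from the north pole used on both sides and restricted to the sphere minus both poles. The choice of this map, the inverse chart formula (the standard one, assumed here) and all names are illustrative only.

```python
def phi(p):
    """Stereographic chart from the north pole (defined for z != 1)."""
    x, y, z = p
    return (x / (1.0 - z), y / (1.0 - z))

def phi_inverse(xi):
    """Point of the unit sphere with chart coordinates xi = (u, v)."""
    u, v = xi
    s = u * u + v * v
    return (2.0 * u / (s + 1.0), 2.0 * v / (s + 1.0), (s - 1.0) / (s + 1.0))

def antipode(p):
    """The map f : X -> X sending a point of the sphere to the opposite point."""
    return tuple(-c for c in p)

def F(xi):
    """Expression of f in the chart phi on both sides: phi o f = F o phi."""
    return phi(antipode(phi_inverse(xi)))

def F_closed_form(xi):
    """The same map computed by hand: xi -> -xi / |xi|^2, for xi != 0."""
    u, v = xi
    s = u * u + v * v
    return (-u / s, -v / s)

xi = (1.8, -2.4)
print(F(xi), F_closed_form(xi))
```

The closed form is a rational function without singularity on the plane minus the origin, so the antipodal map is of class C^∞ there in the sense just defined.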

Maps of class $C^r$ are also called homomorphisms of manifolds, in line
with homomorphisms of groups, rings or vector spaces, etc. in algebra. The
Grothendieck school censored the controversial prefix “homo” and invented
the term morphism, which the Greeks would have probably considered doubly
barbarian.50

This definition makes the characterization of open Cartesian subsets of $X$
possible. First, any open subset $U$ of $X$ is itself a manifold since there are $C^r$
functions on any smaller open subset. In particular, any open subset $U$ of a
space $\mathbb{R}^d$ is a manifold if $C^r$ functions are defined on it. Having said that, $U$
is an open Cartesian subset of a manifold $X$ if and only if, as a manifold, $U$
is diffeomorphic to an open subset of $\mathbb{R}^d$; if, moreover, $(U, \varphi)$ is a chart, then
$\varphi$ is a diffeomorphism from $U$ onto the open subspace $\varphi(U)$ and conversely.
Proofs reduce to rewording exercises.


50 Barbarian: a foreigner, with respect to the Greeks and Romans (Littré).

Many other such trivialities can be found in detailed presentations, in
particular in volume 3 of Dieudonné’s Éléments d’analyse which, fortunately,
gives frequent illustrations in the form of examples or exercises that are
a great deal harder than the soporific but crucial definitions, scholia and
sorites51 that we end up learning through repeated use.

In particular, it is possible to define the notion of a product manifold: take
two manifolds $X$ and $Y$ of class $C^r$ and, if $(U, \varphi)$ and $(V, \psi)$ are charts of
$X$ and $Y$ with values in $\mathbb{R}^p$ and $\mathbb{R}^q$, consider the map $(x, y) \mapsto (\varphi(x), \psi(y))$
from $U \times V$ to $\mathbb{R}^{p+q}$. Making these charts vary gives an atlas of class $C^r$
for $X \times Y$, whence a manifold structure on the topological space52 $X \times Y$.
You will have no difficulty in showing that a map $z \mapsto (f(z), g(z))$ from a
manifold $Z$ to $X \times Y$ is of class $C^r$ if and only if so are $f : Z \to X$ and
$g : Z \to Y$, or that the projections $X \times Y \to X$ and $X \times Y \to Y$ are
of class $C^r$. And many more wonders... One may laugh, but this is what
transformed the loose theory available in the 1930s into a perfectly clear and
precise mechanism, whose concepts often suffice to indicate the notions that
should be introduced and the theorems that should be proved, at least at an
elementary level: they only need to be well-defined.


12 — Tangent vectors and Differentials 


(i) Vectors and tangent vector spaces. In what way can the calculations of the
preceding §§ be generalized to a $d$-dimensional manifold $X$, for example the
notion of a differential form? We immediately encounter a fundamental
difficulty: it is possible to talk about “vectors” and “linear forms” in a
Cartesian space, but there are no vectors in a “curved” space, not even in a
sphere in $\mathbb{R}^3$. To bypass this obstacle, associate to each $a \in X$ an “abstract”
vector space having the same dimension $d$ as $X$, called the tangent vector
space to $X$ at $a$, which I will denote $X'(a)$, other authors adopting other
conventions, for example $T_a(X)$, which I will sometimes use.


51 Scholium: In philology, a grammatical or critical note explaining classical texts.
In geom. A remark on several propositions made in order to show their link,
restriction or extension. Sorite: Sort of argument, in which a series of propositions
is so arranged that the second one must explain the predicate of the first one,
the third one the predicate of the second one, and so on, until the conclusion
wanted is reached. Predicate: In log. and gram. What is denied or affirmed of
the subject of a proposition. In the proposition: All men are mortals, mortals is
the predicate. (Littré).

52 If $X$ and $Y$ are two topological spaces, declaring a subset of $X \times Y$ to be open
if and only if it is a union of sets $U \times V$, where $U$ and $V$ are open in $X$ and $Y$,
defines a topology on $X \times Y$.

To arrive at a definition of $X'(a)$, what is more generally called a tensor
of type $(p,q)$ at the point $a$ can first be defined by drawing on the Italians.
A priori, we do not know what the concrete nature of such an object is,
but we suspect that it must have “components” in each local chart $(U, \varphi)$
at $a$; for example, if $(p,q) = (2,1)$, these components should be numbers
$T_{ij}^k(\varphi)$ depending on three indices and on the chart considered. Having
admitted this, we stipulate that these components should transform according
to formula (3.9) of § 1 when $(U, \varphi)$ is replaced by another chart $(V, \psi)$ at $a$:
if the coordinates $\xi = \varphi(x)$ and $\eta = \psi(x)$ of a variable point $x \in U \cap V$ are
connected by


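$$\eta = \theta(\xi), \qquad \xi = \rho(\eta),$$


where $\theta$ maps $\varphi(U \cap V)$ diffeomorphically onto $\psi(V \cap U)$ and conversely, then
this gives formulas


$$d\eta^\alpha = \theta_i^\alpha(\xi)\, d\xi^i, \qquad d\xi^i = \rho_\alpha^i(\eta)\, d\eta^\alpha$$
with partial derivatives
(12.1)  $$\theta_i^\alpha(\xi) = \partial\eta^\alpha/\partial\xi^i, \qquad \rho_\alpha^i(\eta) = \partial\xi^i/\partial\eta^\alpha.$$


Having said this, the numbers $T_{\alpha\beta}^\gamma(\psi)$ corresponding to the chart $(V, \psi)$ must
satisfy the relation


(12.2)  $$T_{\alpha\beta}^\gamma(\psi) = \rho_\alpha^i(\eta)\, \rho_\beta^j(\eta)\, \theta_k^\gamma(\xi)\, T_{ij}^k(\varphi),$$


where the coefficients are calculated at the points $\xi = \varphi(a)$ and $\eta = \psi(a)$. At
this stage of the definition, we are reduced to “absolute differential calculus”:
we do not know over what we are calculating, but we calculate. We continue
doing so every day in our times, and not only in physics...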

Then the tangent vectors to $X$ at $a$ are, by definition, the tensors of type
$(0,1)$ at $a$. We thus get some $h \in X'(a)$ by taking, in each local chart $(U, \varphi)$
at $a$, numbers $h^i(\varphi)$ that are subject to the equivalent relations


(12.3)  $$h^\alpha(\psi) = \theta_i^\alpha(\xi)\, h^i(\varphi), \qquad h^i(\varphi) = \rho_\alpha^i(\eta)\, h^\alpha(\psi)$$


for any local charts $\varphi$ and $\psi$ at $a$. However, the $\theta_i^\alpha(\xi)$ are the entries of the
Jacobian matrix, with respect to the canonical basis for $\mathbb{R}^d$ and at the point $\varphi(a)$,
of the chart change53


$$\theta : \varphi(x) \mapsto \psi(x).$$
Hence, setting


(12.4)  $$h(\varphi) = h^i(\varphi)\, e_i,$$


where $(e_i)$ is the canonical basis of $\mathbb{R}^d$, we see that

(12.5)  $$h(\psi) = \theta'(\xi)\, h(\varphi):$$


the tangent linear map $\theta'(\xi)$ takes $h(\varphi)$ to $h(\psi)$. This is the formula that we
will always use. The coordinates of the infinitesimal vector connecting the points
$x$ and $x + dx$ of $X$ in the charts $\varphi$ and $\psi$ being $d\xi$ and $d\eta = \theta'(\xi)\,d\xi$, formula (5)
says that the components of a tangent vector must transform like those of $dx$,
despite the fact that the expression $x + dx$ is not well-defined in a curved space.


53 The notation below replaces the formula $\theta[\varphi(x)] = \psi(x)$.

To construct a tangent vector it then suffices to choose $h(\varphi)$ arbitrarily
in a particular chart $\varphi$ at $a$ and to define $h(\psi)$ in the others by (5); it is
nonetheless necessary to check that (5) still holds when passing from a chart
$\varphi_2$ to a chart $\varphi_3$. But if the Jacobian matrices at $a$ of the chart changes
$\varphi(x) \mapsto \varphi_2(x)$, $\varphi_2(x) \mapsto \varphi_3(x)$ and $\varphi(x) \mapsto \varphi_3(x)$ are temporarily denoted
by $M_{12}(a)$, $M_{23}(a)$ and $M_{13}(a)$, then by construction,


$$h(\varphi_2) = M_{12}(a)\, h(\varphi), \qquad h(\varphi_3) = M_{13}(a)\, h(\varphi);$$
it therefore suffices to show that
(12.6)  $$M_{13}(a) = M_{23}(a)\, M_{12}(a);$$


up to notation, this is the multivariate chain rule.
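
Relation (12.6) can be observed numerically on three concrete charts of the right half-plane. In the Python sketch below, the choice of Cartesian, polar and log-polar coordinates, the finite-difference Jacobian and all names are assumptions made for illustration.

```python
import math

def to_polar(p):
    """Chart change phi -> phi_2: Cartesian (x, y) to polar (r, angle), for x > 0."""
    x, y = p
    return (math.hypot(x, y), math.atan2(y, x))

def polar_to_logpolar(q):
    """Chart change phi_2 -> phi_3: (r, angle) to (log r, angle)."""
    r, angle = q
    return (math.log(r), angle)

def to_logpolar(p):
    """Chart change phi -> phi_3, the composite of the two maps above."""
    return polar_to_logpolar(to_polar(p))

def jacobian(f, p, eps=1e-6):
    """Central-difference approximation of the Jacobian matrix of f at p (list of rows)."""
    columns = []
    for j in range(len(p)):
        hi, lo = list(p), list(p)
        hi[j] += eps
        lo[j] -= eps
        columns.append([(u - v) / (2 * eps) for u, v in zip(f(hi), f(lo))])
    return [list(row) for row in zip(*columns)]

def matmul(a, b):
    """Product of two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]

xi = (1.0, 2.0)                               # coordinates of the point a in the chart phi
M12 = jacobian(to_polar, xi)
M23 = jacobian(polar_to_logpolar, to_polar(xi))
M13 = jacobian(to_logpolar, xi)
product = matmul(M23, M12)
print(M13)
print(product)
```

Note that M23 is evaluated at the image of the point in the second chart, which is exactly what the chain rule requires.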

The set $X'(a)$ of tangent vectors to $X$ at $a$ being thus defined, it can be
turned into a vector space by, for example, defining the sum $h = h' + h''$
of two tangent vectors by $h(\varphi) = h'(\varphi) + h''(\varphi)$: we do what is necessary for
the map $h \mapsto h(\varphi)$ from $X'(a)$ to $\mathbb{R}^d$ to be linear in any chart valid in the
neighbourhood of $a$. As it is bijective,


$$\dim X'(a) = \dim X.$$


Then, as in the case of a Cartesian space [§ 1, n° 3, (i)], a basis $(a_i(\xi))$ for
$X'(x)$ can be associated to any local chart $(U, \varphi)$ and any $x \in U$,54 $\xi = \varphi(x)$:
the basis which corresponds under $h \mapsto h(\varphi)$ to the canonical basis $(e_i)$ of
$\mathbb{R}^d$. As $h(\varphi) = h^i(\varphi)\, e_i$ for any $h \in X'(x)$,


(12.7)  $$h = h^i(\varphi)\, a_i(\xi)$$


for all $h \in X'(x)$, so that the $h^i(\varphi)$ are now the coordinates of $h$ with respect
to the basis $(a_i(\xi))$ for $X'(x)$. Elie Cartan, who knew all this intuitively, took
advantage of it to define Leibniz’s infinitesimal vector $dx$: letting $\xi^i + d\xi^i$ be
the coordinates of a point “infinitely near” the point $x$ with coordinates $\xi^i$,
set


(12.7’)  $$dx = a_i(\xi)\, d\xi^i.$$


If the chart is changed, the $a_i(\xi)$ and $d\xi^i$ are inversely transformed, so that the
vector $dx$ has an “absolute” meaning, at least metaphysically. Elie Cartan
also used to speak of the point $x + dx$ of $X$ but, as mentioned above, this
reaches the limit of acceptable misnomers when $X$ is not contained in a
Cartesian space, and even in this case.


54 This notation has the inconvenience of not specifying the chart $\varphi$ used, but the
notation $a_i(\varphi, x)$ is too cumbersome. The person who will succeed in introducing
in differential geometry a notation system perfectly coherent and comprehensible
in all cases is probably not born yet. See the notation index in Dieudonné.

The construction of $X'(a)$ makes it possible to reduce the definition of
tensors at $a$ given above to that of § 1, n° 1, (ii). First, covectors at the point $a$
can be defined either as elements of the dual $X'(a)^*$ of $X'(a)$, or as tensors of
type $(1,0)$ having in each chart $(U, \varphi)$ at $a$ components $u_i(\varphi)$ whose transformation
is given by the relation


$$u_\alpha(\psi) = \rho_\alpha^i(\eta)\, u_i(\varphi).$$


Indeed, comparing with the transformation formula (3) shows that the number
$u(h) = u_i(\varphi)\, h^i(\varphi)$ is independent of the chart $\varphi$, and so defines a linear
functional $u$ on $X'(a)$, and (4) then shows that


$$u_i(\varphi) = u[a_i(\xi)].$$


Now, if there is a tensor $T$ of type $(2,1)$ for example, the transformation
formula (2) shows that, for $h, k \in X'(a)$ and $u \in X'(a)^*$, the expression


$$T(h, k; u) = T_{ij}^k(\varphi)\, h^i(\varphi)\, k^j(\varphi)\, u_k(\varphi)$$


is independent of $\varphi$, and so has an “absolute” meaning. Hence, tensors at $a$
defined in the manner of the Italians of 1900 are merely tensors on the vector
space $X'(a)$ in the sense of § 1, n° 1, (ii). We now feel like we know what we
are talking about, but we have in effect just expressed the founders’ concepts
in modern algebraic language.


(ii) Tangent vector to a curve. A simple method for constructing a vector
$h \in X'(a)$ is to use a path or a curve $\mu : I \to X$, where $I \subset \mathbb{R}$ is an open
interval containing 0, with $\mu(0) = a$. Assuming that the $\mathbb{R}^d$-valued function
$\varphi[\mu(t)]$ is differentiable at $t = 0$ in a local chart $(U, \varphi)$ at $a$, it is possible to
set


(12.8)  $$h(\varphi) = \lim_{t \to 0} \frac{\varphi[\mu(t)] - \varphi[\mu(0)]}{t} = D(\varphi \circ \mu)(t) \quad\text{for } t = 0,$$
where $D = d/dt$. The multivariate chain rule immediately shows that condition
(5) holds. This gives some $h \in X'(a)$, which we write $\mu'(0)$, an expression
that should not be confused with $\lim\,[\mu(t) - \mu(0)]/t$, as this is not well-defined
except when $X$ is contained in a Cartesian space, and as in this case, it is not,
strictly speaking, a tangent vector to $X$ in the more abstract sense adopted
here; we will return to this later. It would be natural to say that $\mu'(0)$ is
the tangent vector to $\mu$ at the point $a$ (mechanical engineers would talk of a
velocity vector), provided, once again, that we do not let classical but misleading
images mystify us.

Any tangent vector at $a \in X$ can be obtained in this manner: if $h$ corresponds
to the vector $h(\varphi) \in \mathbb{R}^d$ in the chart $(U,\varphi)$, it suffices to choose as $\mu$
the map defined by


$$\varphi[\mu(t)] = \varphi(a) + t\,h(\varphi).$$


In particular, the vectors $a_i(\xi)$ of the basis for $X'(a)$ correspond to the
trajectories $t \mapsto \varphi(a) + te_i$ in $\mathbb{R}^d$, where $(e_i)$ is the canonical basis. Denoting
the inverse map of $\varphi$ by $f : \varphi(U) \to U$ so that, for all $\xi \in \varphi(U)$, $f(\xi)$ is
the point of $X$ whose coordinates in the chart considered are precisely the
canonical coordinates $\xi^i$ of the point $\xi$, the corresponding curve $\mu$ is obviously
the map


$$t \mapsto f(\xi^1,\ldots,\xi^i + t,\ldots,\xi^d),$$


where $\xi = \varphi(a)$. The basis for $X'(a)$ associated to the chart considered is
obtained by calculating the tangent vectors to these supposed “curvilinear
coordinate axes” at $t = 0$.

Suppose, for example, that $X$ is an $n$-dimensional Cartesian space $E$,
and hence isomorphic but not identical to $\mathbb{R}^n$. The simplest charts for $E$
are obtained by choosing a basis $(a_i)$ for $E$ and by associating to each
$x = \xi^i a_i \in E$ the point $\varphi(x) = \xi^i e_i$ of $\mathbb{R}^n$. If $(b_\alpha)$ is another basis for $E$, we get
the formulas $b_\alpha = g^i_\alpha a_i$ with an invertible matrix $(g^i_\alpha)$; to calculate $\psi(x)$ for
the new chart, set $x = \eta^\alpha b_\alpha$, and so, by definition, $\psi(x) = \eta^\alpha e_\alpha$. Then, as
$\xi^i = g^i_\alpha \eta^\alpha$, formula (1) can be applied with $\rho^i_\alpha(\eta) = g^i_\alpha$. Then consider a
vector $h \in E'(x)$, where $x$ is an arbitrary point of $E$. It has corresponding
vectors


$$h(\varphi) = h^i(\varphi)e_i, \qquad h(\psi) = h^\alpha(\psi)e_\alpha$$


in the charts $(E,\varphi)$ and $(E,\psi)$ thus constructed. The general formula (3)
shows that $h^i(\varphi) = g^i_\alpha h^\alpha(\psi)$, which means that


$$h^i(\varphi)a_i = h^\alpha(\psi)\,g^i_\alpha a_i = h^\alpha(\psi)b_\alpha.$$


So $E'(x)$ can be canonically identified with the vector space $E$ itself, by
the map $h \mapsto h^i(\varphi)a_i$ which, in conformity with Italian mechanics, does not
depend on the chosen basis.
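
The identity $h^i(\varphi)a_i = h^\alpha(\psi)b_\alpha$ can be checked on a small numerical case. The sketch below is only an illustration: the basis `a`, the matrix `g` and the components `h_psi` are arbitrary values chosen for the example, and exact rational arithmetic is used so that the comparison is an equality.

```python
from fractions import Fraction

def combine(coeffs, vectors):
    """Return sum_i coeffs[i] * vectors[i], the vectors being tuples."""
    dim = len(vectors[0])
    return tuple(sum(c * v[k] for c, v in zip(coeffs, vectors)) for k in range(dim))

# Illustrative basis (a_i) of E = R^2, written in fixed reference coordinates.
a = [(Fraction(1), Fraction(2)), (Fraction(0), Fraction(1))]

# Illustrative invertible matrix g[i][alpha]; new basis b_alpha = g^i_alpha a_i.
g = [[Fraction(2), Fraction(1)],
     [Fraction(1), Fraction(1)]]
b = [combine([g[i][alpha] for i in range(2)], a) for alpha in range(2)]

# Components h^alpha(psi) of a tangent vector in the chart attached to (b_alpha).
h_psi = [Fraction(3), Fraction(-5)]

# Transformation rule: h^i(phi) = g^i_alpha h^alpha(psi).
h_phi = [sum(g[i][alpha] * h_psi[alpha] for alpha in range(2)) for i in range(2)]

vector_from_phi = combine(h_phi, a)  # h^i(phi) a_i
vector_from_psi = combine(h_psi, b)  # h^alpha(psi) b_alpha
```

Both computations produce the same element of $E$, which is what the canonical identification of $E'(x)$ with $E$ asserts.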

Conversely, the simplest among the curves $\mu$ running through some $x \in E$
are the lines $t \mapsto x + th$, where $h \in E$ is given. Hence, each $h \in E$ canonically
defines an element of $E'(x)$. It would be extremely surprising if the map
$E \to E'(x)$ thus defined were not the inverse of the one obtained by using
the bases for $E$; we leave it to the reader to check this. It is for good reason
that Leibniz, a philosopher mathematician, believed in the existence of a
pre-established harmony⁵⁵ governing Creation and hence differential geometry.


(iii) Differential of a map. In the general case, the construction of $X'(a)$
makes it possible to define the differential $df(a)$ of a numerical function $f$
differentiable at $a$. For this, choose a chart $(U,\varphi)$ at $a$, write $f(x) = F[\varphi(x)]$,
where the expression $F$ of $f$ in the chart considered is differentiable at
$\xi = \varphi(a)$, and, for all $h \in X'(a)$, set


(12.9) $$df(a;h) = dF[\xi; h(\varphi)] = D_iF(\xi)\,h^i(\varphi),$$


where $D_i = d/d\xi^i$. The multivariate chain rule immediately shows that, under
chart change, the $D_iF(\xi)$ and $h^i(\varphi)$ are inversely transformed, in other words,
that the $D_iF(\xi)$ are transformed like the components of a tensor of type
(0,1); so the left hand side does not depend on the chosen chart, which
legitimizes definition (9). In the particular case of the coordinate function
$x \mapsto \varphi^i(x)$, the function $F$ is $(\xi^1,\ldots,\xi^d) \mapsto \xi^i$, and so


(12.9') $$d\varphi^i(a;h) = h^i(\varphi).$$

Like in a Cartesian space,


(12.9'') $$df(a) = D_if(a)\,d\xi^i$$


is a shorthand version of formula (5), where $D_if$ denotes the derivatives of $f$
considered as a function of local coordinates $\xi^i = \varphi^i(x)$ and the differential
$d\varphi^i(a;h)$ is shortened to $d\xi^i$. This gives a linear functional on $X'(a)$, i.e. a
covector $df(a)$ at $a$. It is obvious that


$$p = fg \Longrightarrow dp(a;h) = df(a;h)\,g(a) + f(a)\,dg(a;h)$$

for all $h \in X'(a)$.


More generally, if $f$ is a map from a $d$-dimensional manifold $X$ to an
$n$-dimensional manifold $Y$, for all $a \in X$, a tangent linear map


(12.10) $$f'(a) : X'(a) \to Y'(b),$$


where $b = f(a)$, may be defined, obviously by making a differentiability
assumption about $f$. As always, the method is imposed by the data of the
situation. Indeed, the aim is to associate to each vector $h \in X'(a)$ a vector
$k \in Y'(b)$. For this, choose charts $(U,\varphi)$ and $(V,\psi)$ of $X$ and $Y$, with $a \in U$,
$b \in V$, $f(U) \subset V$, $\varphi(a) = \xi$ and $\psi(b) = \eta$. The construction of tangent spaces
then furnishes us with bijective maps from $X'(a)$ and $Y'(b)$ to $\mathbb{R}^d$ and $\mathbb{R}^n$ and


⁵⁵ Pre-established harmony is a theory of Leibnitz according to which the spiritual
and the physical world are like two perfect, but independent, clocks, always 
indicating the same time. Littré. 
with a vector $h(\varphi) \in \mathbb{R}^d$. To define $k$, a vector $k(\psi) \in \mathbb{R}^n$ needs to be derived.
Hence we only lack a map from $\mathbb{R}^d$ to $\mathbb{R}^n$, which must be linear in order for
the map $X'(a) \to Y'(b)$ sought to be so. But in the charts considered, $f$
can be expressed by a $C^r$ map


$$F : \varphi(U) \to \psi(V)$$


such that $F(\xi) = \eta$; it has a tangent linear map $F'(\xi) : \mathbb{R}^d \to \mathbb{R}^n$ at $\xi$.
Hence


(12.11) $$k(\psi) = F'(\xi)h(\varphi) \quad \text{where } \xi = \varphi(a)$$

is the unique conceivable solution of the problem; or in coordinates,

(12.11') $$k^p(\psi) = D_iF^p(\xi)\,h^i(\varphi).$$


Nonetheless, as always, some verifications need to be made to show the
“absolute” character of this construction using charts. The simplest is to
observe that, in relation to charts, the derivatives $D_iF^p$ behave in (11') like
a covector at $a$ does in relation to the index $i$ and like a vector at $b = f(a)$
in relation to the index $p$; as the formula respects Einstein's conventions, it
has an absolute character...

Another way of defining $f'(a)$ is to use the construction of tangent vectors
by curves, presented in (ii). If $h \in X'(a)$ is the vector $\mu'(0)$ of a curve $\mu$
drawn in $X$ and such that $\mu(0) = a$, then the image of $\mu$ under $f$ is a
curve $\nu(t) = f[\mu(t)]$ drawn in $Y$ and such that $\nu(0) = b$. There is a vector
$\nu'(0) = k \in Y'(b)$ corresponding to the latter; it is just $f'(a)h$. Indeed, in the
situation used above to define $f'(a)h$, by (8), the vector $h$ is represented in
the chart $(U,\varphi)$ by the derivative vector $h(\varphi)$ at $t = 0$ of the function $\varphi \circ \mu$;
the vector $k \in Y'(b)$ is likewise represented in the chart $(V,\psi)$ by the vector
$k(\psi)$, the derivative at $t = 0$ of the function $\psi \circ \nu$; as


$$\psi \circ \nu = \psi \circ (f \circ \mu) = (\psi \circ f) \circ \mu = (F \circ \varphi) \circ \mu = F \circ (\varphi \circ \mu),$$


the classic multivariate chain rule shows that the derivatives at $t = 0$ of $\psi \circ \nu$
and of $\varphi \circ \mu$ are connected by the formula $k(\psi) = F'(\xi)h(\varphi)$, which reduces
to definition (11).
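
The agreement between the two definitions of $f'(a)h$ can be observed numerically. In the sketch below, the chart expression `F`, its hand-computed Jacobian and the curve `mu` are all hypothetical choices made for the illustration; the derivative of $F \circ (\varphi \circ \mu)$ at $t = 0$ is approximated by a central difference and compared with $F'(\xi)h(\varphi)$.

```python
import math

def F(xi):
    """Hypothetical chart expression of f: a smooth map R^2 -> R^3."""
    x, y = xi
    return (x * y, math.sin(x) + y, x * x - y)

def jacobian_F(xi):
    """Hand-computed matrix of F'(xi); rows indexed by p, columns by i."""
    x, y = xi
    return [[y, x], [math.cos(x), 1.0], [2.0 * x, -1.0]]

def mu(t, xi, h):
    """Chart expression of a curve through xi with velocity h at t = 0.

    The quadratic terms are arbitrary: only the first-order part matters.
    """
    return (xi[0] + t * h[0] + t * t, xi[1] + t * h[1] - 3.0 * t * t)

def velocity(curve, eps=1e-6):
    """Central-difference approximation of the derivative of curve at t = 0."""
    plus, minus = curve(eps), curve(-eps)
    return [(p - m) / (2.0 * eps) for p, m in zip(plus, minus)]

xi, h = (0.5, -1.0), (2.0, 3.0)

# Definition by curves: derivative at t = 0 of F o (phi o mu).
k_curve = velocity(lambda t: F(mu(t, xi, h)))

# Definition (12.11'): k^p = D_i F^p(xi) h^i.
k_jacobian = [sum(row[i] * h[i] for i in range(2)) for row in jacobian_F(xi)]

max_gap = max(abs(u - v) for u, v in zip(k_curve, k_jacobian))
```

The value `max_gap` stays at the level of the finite-difference error, whatever second-order terms are put into `mu`.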

Consider, for example, the case of a Cartesian space $Y = E$. At the end of
(ii) above, we saw that tangent spaces to $E$ can be canonically identified with
$E$; $f'(a)$ can, therefore, be interpreted as a linear map from $X'(a)$ to $E$. To
write it explicitly, start with some $h \in X'(a)$ defined by a curve $\mu(t)$ such that
$\mu(0) = a$; so the image $f'(a)h \in E'(b)$, where $b = f(a)$, is defined by the curve
$t \mapsto f[\mu(t)] = \nu(t)$ in $E$. However, under the above identification of $E'(b)$ with
$E$, $f'(a)h$ becomes an ordinary vector $\nu'(0) = \lim[\nu(t) - \nu(0)]/t$; hence, if,
as appropriate, no difference is made between the “abstract” vector $f'(a)h$
and the “concrete” vector corresponding to it in $E$, it can be computed by
the relation

(12.12) $$f'(a)h = \frac{d}{dt} f[\mu(t)] \quad \text{for } t = 0.$$


Even more particularly, suppose that $E = \mathbb{R}^d$, where $d = \dim(X)$, consider
a chart $(U,\varphi)$ of $X$ in the neighbourhood of $a$ and take $f = \varphi$, so that
$f'(a)$ is bijective. This leads to the derivative at the origin of the function
$\varphi[\mu(t)]$; but by (9), it is precisely the element $h(\varphi)$ of $\mathbb{R}^d$ which defined $h$ in
the chart $(U,\varphi)$. Hence


(12.13) $$\varphi'(a)h = h(\varphi)$$


for all $h \in X'(a)$ and all charts $(U,\varphi)$ in the neighbourhood of $a$.

This is not surprising. Indeed, the aim is to find a preferably natural
way to transform every $h \in X'(a)$ into a vector of $\mathbb{R}^d$ by using $\varphi$. Now,
the definition of tangent vectors itself provides us with such a vector, namely
$h(\varphi)$. What other possibilities are there? The pre-established harmony in these
domains could have saved us a proof...

The multivariate chain rule can be expressed in the language of manifolds.
For this, consider homomorphisms $f : X \to Y$, $g : Y \to Z$ and $p = g \circ f :
X \to Z$; the linear maps $f'(a) : X'(a) \to Y'(b)$, where $b = f(a)$, and
$g'(b) : Y'(b) \to Z'(c)$, where $c = g(b) = p(a)$, are available at $a \in X$. Since
the aim is to find a linear map $p'(a) : X'(a) \to Z'(c)$ that can be deduced
naturally from the data, we are led to believe that it is


(12.14) $$p'(a) = g'(b) \circ f'(a).$$


Fig. 12.6. (Diagram of the maps between the manifolds and their chart images.)


The reader unhappy with this philosophical and theological argument can
always read Dieudonné (vol. 3, p. 24): “It is an immediate consequence of
the definitions and of theorem (8.2.1) on composite functions.” In fact, the
full demonstration would consist in using the charts $(U,\varphi)$, $(V,\psi)$ and $(W,\theta)$
of $X$, $Y$ and $Z$ at $a$, $b$ and $c$, in drawing the diagram of the nine maps
involved in the question: $f$, $g$ and $p$, $\varphi$, $\psi$ and $\theta$, and the maps $F$, $G$ and $P$
expressing $f$, $g$ and $p$ in the charts considered, and in applying repeatedly the
definitions and the multivariate chain rule: an excellent exercise to understand
the mechanism of manifolds.

Because of the definition of $f'(a)$, as in the classic situation (§ 1, n° 2, (v)),
it is possible to define the rank of $f$ at $a$: it is the dimension of the image
in $Y'(b)$ of $X'(a)$ under $f'(a)$, i.e. the rank of the linear map $f'(a)$;
if $F$ expresses $f$ in the local charts at $a$ and $b = f(a)$, the rank of $f$ at $a$ is
clearly equal to that of $F$ at the point $\xi$ corresponding to $a$. It cannot exceed
$\dim(X)$, nor $\dim(Y)$; if it is equal to $\dim(X)$, then $f'(a)$ is injective and we
have an immersion; if it is equal to $\dim(Y)$, $f'(a)$ is surjective and we have
a submersion. In the neighbourhood of a point $a$, the rank of $f$ is at least
equal to its rank at $a$, so that the rank of an immersion or of a submersion is
constant in the neighbourhood of $a$. Maps with this latter property are the
subimmersions. They are characterized by Theorem 1 of § 1, n° 3, (i), whose
generalization to manifolds is obvious.


(iv) Partial differentials. The tangent space $Z'(c)$ at a point $c = (a,b)$
of a product manifold $Z = X \times Y$ is easily determined. Indeed, in this case,
there are projections $\mathrm{pr}_1 : Z \to X$ and $\mathrm{pr}_2 : Z \to Y$ given by $(x,y) \mapsto x$
and $(x,y) \mapsto y$; so their tangent maps define maps $Z'(c) \to X'(a)$ and
$Z'(c) \to Y'(b)$, whence a map $Z'(c) \to X'(a) \times Y'(b)$. As local charts
reduce the case to one where $X$ and $Y$ are open subsets of Cartesian spaces,
this map is clearly linear and bijective. We make no distinctions between
$Z'(c)$ and $X'(a) \times Y'(b)$. If $h \in Z'(c)$ is defined by a curve


$$t \mapsto \mu(t) = (\mu_1(t), \mu_2(t)),$$


its images in $X'(a)$ and $Y'(b)$ are defined by the curves $\mu_1$ and $\mu_2$.

Now, let $X$, $Y$, $Z$ be three manifolds and $f : X \times Y \to Z$ a homomorphism.
We compute its tangent map at $(a,b)$. If $c = f(a,b)$, it maps
$X'(a) \times Y'(b)$ to $Z'(c)$, and if $h \in X'(a)$, $k \in Y'(b)$ are defined by the paths
$\gamma(t)$ and $\delta(t)$ with initial points $a$ and $b$, the image of $(h,k)$ is defined by the
path $t \mapsto f[\gamma(t), \delta(t)]$. As $(h,k) = (h,0) + (0,k)$, it suffices to add the images
of $(h,0)$ and $(0,k)$. The first one is defined by the path $t \mapsto f[\gamma(t), b]$. Thus
the tangent map to $x \mapsto f(x,b)$, which will be denoted $f'_X(a,b)$ or $d_1f(a,b)$,
as well as the tangent map $f'_Y(a,b)$ or $d_2f(a,b)$ to $y \mapsto f(a,y)$ need to be
considered. This implies that $f'(a,b)$ is the map


(12.15) $$f'(a,b) : (h,k) \mapsto f'_X(a,b)h + f'_Y(a,b)k.$$


The relation with the formulas of n° 2, (iii), in particular (2.24), is clear;
besides, thanks to local charts, (15) reduces to (2.24).
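
Formula (12.15) lends itself to a quick numerical check when $X = Y = Z = \mathbb{R}$. In the sketch below the function `f` and the paths `gamma` and `delta` are arbitrary choices made for the illustration; the derivative at $t = 0$ along the path $t \mapsto f[\gamma(t), \delta(t)]$ is compared with the sum of the derivatives obtained by freezing one variable at a time.

```python
import math

def f(x, y):
    """Hypothetical smooth map R x R -> R."""
    return x * math.exp(y) + math.sin(x * y)

def derivative_at_zero(path, eps=1e-6):
    """Central-difference approximation of the derivative of path at t = 0."""
    return (path(eps) - path(-eps)) / (2.0 * eps)

a, b = 0.7, -0.3

def gamma(t):
    """Path in X with gamma(0) = a and velocity h = 2."""
    return a + 2.0 * t

def delta(t):
    """Path in Y with delta(0) = b and velocity k = -1.5."""
    return b - 1.5 * t + t * t

# Image of (h, k): derivative along t -> f[gamma(t), delta(t)].
full = derivative_at_zero(lambda t: f(gamma(t), delta(t)))

# Image of (h, 0): second variable frozen at b.
first = derivative_at_zero(lambda t: f(gamma(t), b))

# Image of (0, k): first variable frozen at a.
second = derivative_at_zero(lambda t: f(a, delta(t)))

gap = abs(full - (first + second))
```

The quantity `gap` is of the order of the finite-difference error, in agreement with the decomposition $(h,k) = (h,0) + (0,k)$.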

(v) The manifold of tangent vectors. Denote by $X'$ or $T(X)$ the set of
all tangent vectors to $X$, i.e. of couples $(x,h)$ with $x \in X$ and $h \in X'(x)$. By
associating to each $h \in X'(x)$ its “initial point” $x$, we get a canonical map
$p : T(X) \to X$. The set $T(X)$ can easily be made into a manifold. First, if
$(U,\varphi)$ is a chart of $X$, the inverse image $p^{-1}(U)$ is the set $T(U)$ of tangent
vectors to the manifold $U$; if $n = \dim(X)$, the map $\varphi' : (x,h) \mapsto (\varphi(x), h(\varphi))$
from $T(U)$ to the Cartesian product $\varphi(U) \times \mathbb{R}^n$ is bijective. Formulas for
chart change of section (i) of the present n° show that, if $(V,\psi)$ is another
chart of $X$ and if the chart change is of class $C^r$ on $U \cap V$, then the image of
$(\varphi(x), h(\varphi))$ is obviously $(\psi(x), h(\psi))$ under a $C^{r-1}$ map. This gives a $C^{r-1}$
structure on $T(X) = X'$: the open subsets $\Omega$ of $T(X)$ are defined by the
condition that, for any chart $(U,\varphi)$ of $X$, the image of $\Omega \cap T(U)$ under $\varphi'$ is
an open subset of $\varphi(U) \times \mathbb{R}^n$, so that the couples $(T(U), \varphi')$ become charts
of class $C^0$ for $T(X)$; as they constitute an atlas of class $C^{r-1}$ for $T(X)$, the
definition of the manifold $T(X)$ follows.

To any homomorphism $f : X \to Y$ of manifolds can be associated a
homomorphism $f' : X' \to Y'$, namely


(12.16) $$f' : (a,h) \mapsto (f(a), f'(a)h).$$


If there is another homomorphism $g : Y \to Z$, then setting $p = g \circ f$, the
multivariate chain rule shows that


(12.17) $$p' = g' \circ f'.$$


Indeed, $f'$ maps $(x,h)$ to $(f(x), f'(x)h)$, which is then mapped by $g'$ to
$(g(f(x)), g'(f(x))f'(x)h)$. It, therefore, remains to check that $p(x) = g(f(x))$,
which is trivial, and that $p'(x) = g'(f(x)) \circ f'(x)$.

For a product manifold $Z = X \times Y$, there is a canonical isomorphism
$Z' = X' \times Y'$ since, for $x \in X$ and $y \in Y$, the tangent space $T_{(x,y)}(Z)$ was
identified with $T_x(X) \times T_y(Y)$.

All this is too easy though sometimes convenient, especially in Lie
group theory. Explaining the structure of the manifolds $T(T(X)) = T^2(X)$,
$T(T(T(X))) = T^3(X)$, etc. is, however, far less simple. Then, any
homomorphism $f : X \to Y$ has “extensions” $f^{(r)} : T^r(X) \to T^r(Y)$ that are
homomorphisms and for which formula (17) becomes


$$p^{(r)} = g^{(r)} \circ f^{(r)}.$$


Interpreted in classical terms, this order $r$ multivariate chain rule is also not
obvious. For a start, try to understand the case $r = 2$.

Exercise. Call an arbitrary basis of a space $T_x(X)$ a frame. Construct a
natural manifold structure on the set of frames of $X$.
The most immediately obvious manifolds are certain subsets $X$ of a Cartesian
space $E$ which, for simplicity's sake, we will often suppose to be $\mathbb{R}^n$. There
are several equivalent and equally important methods to equip them with a
natural differentiable structure;⁵⁶ these methods were conceived in the 17th


Fig. 13.7. 


⁵⁶ It should not be thought that all possible methods for defining a manifold
structure on a given set, however familiar they may be, lead to the same result. First,
a method to construct two different, though isomorphic, differentiable structures
on a manifold $X$ involves choosing a non-differentiable homeomorphism $\sigma$ from
$X$ onto $X$ and declaring that the differentiable functions for the second
structure are those obtained by composing $\sigma$ with a differentiable function for the first
one. Since this procedure, which can, to start with, be applied in $\mathbb{R}$, is within
everyone's reach, two such manifold structures are considered equivalent. The real question
is whether there are others. Some experts in algebraic topology (M. Kervaire
and J. Milnor, Annals of Math., 77, 1963) have calculated the number $\nu_n$, which
happens to be finite, of non-equivalent $C^\infty$ structures that can be defined on the
unit sphere of $\mathbb{R}^n$:


n:      ≤6    7    8    9   10   11   12   13   14     15   16   17
ν_n:     1   28    2    8    6  992    1    3    2  16256    2   16


Others have explicitly described some of these bizarre structures that were
unknown. $C^\infty$ structures essentially different from those of everyone else can also
be defined on $\mathbb{R}^4$ (but not on $\mathbb{R}^n$ with $n \le 3$). Finally, there are topological
manifolds, i.e. of class $C^0$, on which no $C^1$ structure can be defined. Naive ideas
are sometimes incorrect.
century by using Cartesian coordinates; the theorems used are, except for
generality and language, known since the 19th century. Since the 18th, people
have studied “curves” and “surfaces” in the plane or the space, sometimes
defined by an equation $y = f(x)$ in the plane (the parabola $y = 2x^2$ for
example), sometimes by a parametric representation (the ellipse $x = a\cos t$,
$y = b\sin t$ for example), sometimes by a relation between coordinates (the
sphere $x^2 + y^2 + z^2 = 1$ for example). More complicated curves and surfaces
were soon considered, for example the planar strophoid with equation


$$(x - 1)y^2 + x^2(x + 1) = 0,$$

which can also be obtained by the parametric representation

$$x = (t^2 - 1)/(t^2 + 1), \qquad y = tx;$$


there are two arcs of simple curves in the neighbourhood of the origin meeting
at 0, with distinct tangents; this is an example of a singularity which does
not fall within the framework about to be defined (fig. 7).
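
That the parametric representation does satisfy the equation of the strophoid, and that the origin is reached for two distinct values of the parameter, can be confirmed by an exact computation. The sketch below is a plain check with rational arithmetic; the sample of parameter values is an arbitrary choice.

```python
from fractions import Fraction

def strophoid_point(t):
    """Point of the strophoid with parameter t: x = (t^2 - 1)/(t^2 + 1), y = t x."""
    x = (t * t - 1) / (t * t + 1)
    return x, t * x

def implicit(x, y):
    """Left-hand side of the equation (x - 1) y^2 + x^2 (x + 1) = 0."""
    return (x - 1) * y * y + x * x * (x + 1)

# Arbitrary rational sample of parameters between -3 and 3.
samples = [Fraction(n, 4) for n in range(-12, 13)]

residuals = [implicit(*strophoid_point(t)) for t in samples]

# Parameters of the sample for which the curve passes through the origin.
origin_parameters = [t for t in samples if strophoid_point(t) == (0, 0)]
```

All residuals vanish exactly, and the origin is obtained for $t = -1$ and $t = 1$: these are the two arcs meeting at 0.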

The three classical methods recalled above fall within a more general
pattern: take two manifolds $X$ and $Y$ and a map $f : X \to Y$ and consider
either the image $f(X) \subset Y$ (the case of parametric representations), or, for
some $b \in Y$, the set of solutions of $f(x) = b$ in $X$ (submanifold defined by
relations between coordinates). The graph of a function $X \to Y$ falls within
this general pattern either by considering it as the image of $X$ under the map
$x \mapsto (x, f(x))$ from $X$ to $X \times Y$, or as the set of solutions of $y - f(x) = 0$.
In this n°, we are going to define submanifolds and then show how some may
be obtained by these methods (direct image of a map, inverse image of a
point under a map) provided we restrict ourselves to subimmersions, i.e. to
maps of constant rank.


(i) Submanifolds. Let $X$ be a subset of a manifold $Y$ of class $C^r$ and
dimension $q$. For any open subset $U$ of $X$, let $C^r(U)$ be the set of functions $f$
defined on $U$ and satisfying the following property: for all $a \in U$, there is
an open neighbourhood $V(a)$ of $a$ in $Y$ and a $C^r$ function on $V(a)$ which is
equal to $f$ on $U \cap V(a)$. The example of the sphere (n° 11, (i)) suggests that
$X$ should be called a submanifold of $Y$ if and only if the topological space
$X$ has the structure of a manifold of a priori arbitrary dimension $p$, and for
which the $C^r$ functions are precisely those that have been defined.

A first consequence of this definition is that the identity map $x \mapsto x$ from
$X$ to $Y$ is then of class $C^r$: it follows from the definition of these maps. In
fact, it is an immersion, and so $p \le q$ as expected; moreover, there is then a chart
$(V,\psi)$ of $Y$ at any $a \in X$ (for convenience, it can always be supposed to
be cubic) such that $\psi(X \cap V)$ is the face of the cube $\psi(V)$ defined by the
relations $\xi^{p+1} = \ldots = \xi^q = 0$, i.e. such that


(13.1) $$\psi(V \cap X) = \psi(V) \cap \mathbb{R}^p,$$

and the restrictions $\varphi^i$ to $V \cap X = U$ of the functions $\psi^1,\ldots,\psi^p$ then define
a chart $(U,\varphi)$ of $X$. The identity map $X \to Y$ can, therefore, be expressed
in these charts by


$$(\xi^1,\ldots,\xi^p) \mapsto (\xi^1,\ldots,\xi^p,0,\ldots,0).$$


This result means that, locally and up to a diffeomorphism, a $p$-dimensional
submanifold of a $q$-dimensional manifold resembles a $p$-dimensional vector
subspace in a $q$-dimensional vector space. On the other hand, as $(V,\psi)$ is a
cubic chart of $Y$, relation (1) shows the existence of a $C^r$ map $p : V \to V \cap X$
such that $p(x) = x$ for all $x \in V \cap X$: for $p$, it suffices to choose the map
expressed in $(V,\psi)$ by the projection


$$(\eta^1,\ldots,\eta^q) \mapsto (\eta^1,\ldots,\eta^p,0,\ldots,0) \in \mathbb{R}^q.$$


To prove (1), let us choose arbitrary charts $(U,\varphi)$ and $(V,\psi)$ of $X$ and $Y$
at $a \in X$, with $\varphi(a) = \psi(a) = 0$. Since, in the neighbourhood of $a$, the $\varphi^i$ are
restrictions to $X$ of $C^r$ functions on an open subset of $Y$, replacing $U$ and $V$
by smaller open subsets, we may suppose that $U = X \cap V$ and that there are
$g^i \in C^r(V)$ such that $\varphi^i = g^i$ in $U$. Denoting by $u^i \in C^r(V')$ the functions
on $V' = \psi(V)$ expressing the $g^i$, we get


$$\varphi^i(x) = u^i[\psi^1(x),\ldots,\psi^q(x)] \quad \text{for all } x \in U.$$


Since the restrictions of the $\psi^j$ to $U$ are of class $C^r$, similarly there are $v^j$ of
class $C^r$ on the open subset $U' = \varphi(U)$ of $\mathbb{R}^p$ such that


$$\psi^j(x) = v^j[\varphi^1(x),\ldots,\varphi^p(x)] \quad \text{for all } x \in U.$$


As the maps $u = (u^1,\ldots,u^p) : V' \to U'$ and $v = (v^1,\ldots,v^q) : U' \to V'$
satisfy $u \circ v = \mathrm{id}$, we get $u'(0) \circ v'(0) = 1$, so that the map $v'(0)$ is injective; since $v$
expresses the map $\mathrm{id} : X \to Y$ in the charts considered, $\mathrm{id}'(a) : X'(a) \to
Y'(a)$ is also injective, which proves that $\mathrm{id} : X \to Y$ is an immersion.

It remains to prove that it is possible to choose the chart $(V,\psi)$ in such a
way that (1) holds. This will follow from the standard form of subimmersions
(§ 1, n° 3, (i), Theorem 1). Indeed, this theorem shows that, if there are
manifolds $X$ and $Y$ of dimensions $p$ and $q$ and a map $f : X \to Y$ of constant
rank $r$ in the neighbourhood of some $a \in X$, then there exist a chart $(U,\varphi)$
of $X$ at $a$ and a chart $(V,\psi)$ of $Y$ at $b = f(a)$ such that $f(U) \subset V$, $\varphi(a) = 0$,
$\psi(b) = 0$ and in which the coordinates $\xi^i = \varphi^i(x)$ of some $x \in U$ are changed
into the coordinates $\eta^j = \psi^j(y)$ of $y = f(x) \in V$ by the formulas


(13.2) $$\eta^1 = \xi^1, \ldots, \eta^r = \xi^r, \quad \eta^{r+1} = \ldots = \eta^q = 0.$$


If $X$ is a submanifold of $Y$, this result applies to the immersion $f = \mathrm{id} :
X \to Y$, for which $r = p$. The condition $f(U) \subset V$ is written $U \subset V$, so
that $U = X \cap V$ may be assumed by replacing $V$ by a smaller open set. The
first $p$ relations of (2) then show that the functions $\varphi^i$ ($1 \le i \le p$) are the
restrictions to $U$ of the first $p$ functions $\psi^i$, and the following ones show that
$\varphi(U) = \psi(V) \cap \mathbb{R}^p$, which gives (1).

Conversely, if for all $a \in X$ there is a chart $(V,\psi)$ of $Y$ satisfying (1),
then $X$ is a submanifold of $Y$. Indeed, condition (1) shows that for any open
subset $U$ of $V \cap X$, the $f \in C^r(U)$, where $C^r(U)$ is defined for all open subsets
of $X$ like at the beginning of this section, are the $C^r$ functions of the first $p$
coordinates $\psi^i(x)$ on the open subset $\psi(U)$ of $\mathbb{R}^p$; denoting by $\varphi^i \in C^r(U)$
the restrictions to $U$ of these first $p$ functions $\psi^i$, we get a chart $(U,\varphi)$ for $X$.
The charts of $X$ obtained in this manner are pairwise $C^r$-compatible since
so are the charts of $Y$ used to construct them. Hence the result.

A corollary is that any submanifold $X$ of a manifold $Y$ is open in its
closure $\bar{X}$, which means that $\bar{X} - X$ is closed in $Y$, or that a sequence of
points $a_n \in \bar{X}$ converges to some $a \in X$ only if $a_n \in X$ for large $n$. Indeed,
let $(V,\psi)$ be a chart for $Y$ at $a$ such that $V \cap X$ is defined by relation (1); the
$a_n$ are in $V$ for large $n$ and so are the limits of points of $X \cap V$; as (1) shows
that $\psi(V \cap X)$ is closed in $\psi(V)$, $\psi(a_n) \in \psi(V \cap X)$, and so $a_n \in V \cap X \subset X$,
qed.

It goes without saying that if a chart $(V,\psi)$ of $Y$ at $a \in X$ is chosen
randomly, the restrictions of the $\psi^j$ to $U = X \cap V$ do not constitute a chart
for $X$; to start with, there are too many of them. But if $X$ and $Y$ have
dimensions $p$ and $q$, $p$ functions forming a chart for $X$ can be extracted from
these $q$ restrictions. Indeed, as shown above, there is a chart $(U,\varphi)$ for $X$ (it
may be assumed to be defined on $U$ by choosing $V$ sufficiently small) such
that the $\varphi^i$ are the restrictions to $U$ of the first $p$ coordinate functions of a
chart $(V,\theta)$ of $Y$, which can here too be supposed to be defined on $V$. The $\theta^j$
being $C^r$ functions on $V$ of the $\psi^j$, the $\varphi^i$ are $C^r$ functions on $U$ of the
restrictions to $U$ of the $\psi^j$; but since the $\varphi^i$ form a chart for $X$ in $U$, these
restrictions are also $C^r$ functions of the $\varphi^i$. The $p \times q$ Jacobian matrix of the
restrictions of the $\psi^j$ with respect to the $\varphi^i$ is, therefore, of maximum rank
$p$, and hence extracting a non-zero determinant of order $p$ from it gives the
$p$ functions sought.

More generally, if $X$ is a $p$-dimensional manifold and if, in the neighbourhood
of some $a \in X$, there are $p$ functions of class $C^r$ with non-zero Jacobian in
a local (hence in all) chart, these $p$ functions define a chart for $X$ in the
neighbourhood of $a$: this is the local inversion theorem.

Finally, note that, if $X$ is a $p$-dimensional submanifold of a $q$-dimensional
manifold $Y$, by the canonical immersion from $X$ to $Y$, for any $x \in X$,
the tangent space $X'(x)$ can be identified with its image in $Y'(x)$ under the
linear map $\mathrm{id}'(x) : X'(x) \to Y'(x)$. In (ii) we will see how to determine this
subspace of $Y'(x)$ using local equations of $X$.

Exercise 1. Let $Y$ be a $q$-dimensional manifold and $X$ a subset of $Y$
equipped with a manifold structure such that the map $x \mapsto x$ is an immersion.
Show that $X$ is a submanifold of $Y$.

Exercise 2. Let $Z$ be a manifold, $Y$ a submanifold of $Z$ and $X$ a subset of
$Y$. Show that $X$ is a submanifold of $Z$ if and only if it is a submanifold of $Y$.

Exercise 3. Let $f : X \to Y$ be a homomorphism mapping a submanifold
$X'$ of $X$ to a submanifold $Y'$ of $Y$. Show that the map $X' \to Y'$ induced
by $f$ is a homomorphism.

Exercise 4. Let $X$ be the union in $\mathbb{R}^2$ of the coordinate half-axes $\{x \ge 0\}$
and $\{y \ge 0\}$ and $P$ the point $(-1,-1)$. For all $M \in X$, let $t$ be the slope of
the line $PM$, so that $M \mapsto t$ is a homeomorphism from $X$ to $\mathbb{R}^*_+$. For any
open subset $U$ of $X$, let $C^\infty(U)$ be the set of functions on $U$ that are $C^\infty$
as functions of $t$, which gives a $C^\infty$ manifold structure on $X$ diffeomorphic
(under $M \mapsto t$) to $\mathbb{R}^*_+$. Show that $X$ is not a submanifold of $\mathbb{R}^2$.

Exercise 5. A submanifold is open in its closure. With the help of
examples, show that it is not necessarily a submanifold.


(ii) Submanifolds defined by a subimmersion. Let $X$ and $Y$ be two
manifolds of dimensions $p$ and $q$ and $f : X \to Y$ a $C^r$ map. First consider the
set $Z$ of solutions of $f(x) = b$ for some given $b \in f(X)$ and suppose that the
rank $r$ of $f$ is constant in an open subset containing $Z$ (but not necessarily
in all of $X$); $Z$ is then a submanifold of $X$.

Indeed, for all $a \in Z$, there exist charts $(U,\varphi)$ and $(V,\psi)$ of $X$ and $Y$ at $a$
and $b$ for which $\varphi(a) = 0$, $\psi(b) = 0$, $f(U) \subset V$ and in which the map


(13.3) $$F = \psi \circ f \circ \varphi^{-1},$$

whose rank is constant in the neighbourhood of 0, is given by

(13.4) $$\eta^1 = \xi^1, \ldots, \eta^r = \xi^r, \quad \eta^{r+1} = \ldots = \eta^q = 0;$$


$\varphi(U \cap Z)$ is then defined by the relations $\xi^1 = \ldots = \xi^r = 0$, so the result of
section (i) implies that $Z$ is a submanifold of dimension $p - r$ of $X$.

If, moreover, as at the end of section (i), the space $Z'(a)$ is identified with
its image in $X'(a)$ under $\mathrm{id}'(a)$, where $\mathrm{id} : Z \to X$, then⁵⁷


(13.5) $$Z'(a) = \operatorname{Ker} f'(a).$$


This is the subspace of $h \in X'(a)$ such that $f'(a)h = 0$. Indeed, by definition
of $Z$, the map $f \circ \mathrm{id}$ from $Z$ to $Y$ is the constant map $z \mapsto b$. Hence
$f'(a) \circ \mathrm{id}'(a) = 0$, and so $Z'(a) \subset \operatorname{Ker} f'(a)$. As $f'(a) : X'(a) \to Y'(b)$ has rank $r$
and $X'(a)$ dimension $p$,

$$\dim \operatorname{Ker} f'(a) = p - r = \dim Z'(a).$$


⁵⁷ If $u : E \to F$ is a linear map, $\operatorname{Ker} u$ denotes the set of $h \in E$ such that $u(h) = 0$,
and $\operatorname{Im} u$ the set of $u(h) \in F$. Then


$$\mathrm{rg}(u) = \dim \operatorname{Im} u = \dim E - \dim \operatorname{Ker} u.$$

(5) follows.

Example: The sphere in $\mathbb{R}^3$. The map from $X = \mathbb{R}^3$ to $Y = \mathbb{R}$ given by
$f(x,y,z) = x^2 + y^2 + z^2$ has rank 1 everywhere except at the origin where
its three derivatives are zero. Hence, for $R \ne 0$, equation $f = R^2$ defines a
2-dimensional submanifold $Z$ of $\mathbb{R}^3$. At every point $(a,b,c)$ of $Z$, the subspace
$Z'(a,b,c)$ of $X'(a,b,c) = X$ is the set of vectors $(dx,dy,dz)$ orthogonal to
$(a,b,c)$ since


$$df[(a,b,c);(dx,dy,dz)] = 2(a\,dx + b\,dy + c\,dz).$$


Another example: consider the orthogonal group $G = O_n(\mathbb{R}) \subset GL_n(\mathbb{R})$,
i.e. the set of $n \times n$ matrices satisfying $g'g = 1$, where $g'$ is the transpose of
$g$. Taking $X = Y = M_n(\mathbb{R})$ and $f(x) = x'x$, a map from $X$ to $Y$,


$$df(x;h) = h'x + x'h$$


so that, for given $x \in X$, the kernel of $f'(x)$ is the set of $h \in M_n(\mathbb{R})$ such
that $h'x + x'h = 0$, i.e. such that $x'h = u$ is an antisymmetric matrix. If $x$
is invertible, the map $u \mapsto x'^{-1}u$ is an isomorphism from the vector space of
antisymmetric matrices onto $\operatorname{Ker} f'(x)$; the dimension of $\operatorname{Ker} f'(x)$, and hence
also the rank of $f'(x)$, is, therefore, constant in the open subset $GL_n(\mathbb{R}) \supset G$
of $M_n(\mathbb{R})$. So the group $G$ is a submanifold of $M_n(\mathbb{R})$, the vector subspace of
$M_n(\mathbb{R})$ tangent to it at $g = 1$ being also the kernel of


$$h \mapsto df(1;h) = h' + h,$$


i.e. the set of antisymmetric matrices.
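
The description of $\operatorname{Ker} f'(x)$ can be tested on a concrete orthogonal matrix. In the sketch below, the rotation `g` and the antisymmetric matrix `u` are arbitrary choices made for the illustration; since $g'g = 1$, the matrix $g'^{-1}$ is $g$ itself, so that $h = gu$ should be annihilated by $h \mapsto h'g + g'h$.

```python
import math

def matmul(p, q):
    """Product of two matrices given as lists of rows."""
    return [[sum(p[i][k] * q[k][j] for k in range(len(q))) for j in range(len(q[0]))]
            for i in range(len(p))]

def transpose(m):
    """Transpose of a matrix given as a list of rows."""
    return [list(col) for col in zip(*m)]

def add(p, q):
    """Entrywise sum of two matrices of the same size."""
    return [[u + v for u, v in zip(r, s)] for r, s in zip(p, q)]

def df(x, h):
    """Differential of f(x) = x'x in the direction h, i.e. h'x + x'h."""
    return add(matmul(transpose(h), x), matmul(transpose(x), h))

# An orthogonal matrix: rotation by an arbitrary angle about the third axis.
theta = 0.4
g = [[math.cos(theta), -math.sin(theta), 0.0],
     [math.sin(theta), math.cos(theta), 0.0],
     [0.0, 0.0, 1.0]]

# An arbitrary antisymmetric matrix u.
u = [[0.0, 2.0, -1.0],
     [-2.0, 0.0, 3.0],
     [1.0, -3.0, 0.0]]

# Candidate tangent vector at g: h = g'^{-1} u = g u.
h = matmul(g, u)

residual = max(abs(entry) for row in df(g, h) for entry in row)
```

Up to rounding errors, `residual` is zero: $h = gu$ is tangent to $G$ at $g$.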
Exercise 6. Consider $M_n(\mathbb{C})$ as a real vector space. Let $U_n(\mathbb{C})$ be the
group of unitary matrices, i.e. satisfying


$$u^*u = 1 \quad \text{where } u^* = \bar{u}'$$

is the adjoint matrix of $u$ (complex conjugate of the transpose). Show that
$U_n(\mathbb{C})$ is a submanifold of $M_n(\mathbb{C})$.


Let us return to the general case and investigate the image $Z = f(X)$ of
$f : X \to Y$ by supposing that the rank of $f$ is constant in all of $X$. For
some $b = f(a)$, relations (4) show that the image of $f(U)$ under $\psi$ is defined
by $\eta^{r+1} = \ldots = \eta^q = 0$ this time, and so $f(U)$ is a submanifold of $Y$. This would
imply that $f(X)$ is a submanifold of $Y$ if $f(U)$ were a neighbourhood of $b$ in
$f(X)$; but this condition does not necessarily hold as will be seen in the next
section. It is, therefore, prudent to suppose that $f$ is an open map from $X$
to $f(X)$, i.e. takes open subsets of $X$ to open subsets of $f(X)$. Then clearly,
as a map from the manifold $X$ to the manifold $f(X)$, $f$ is a submersion and
the subspace of $Y'(b)$ tangent to $Z = f(X)$ is


(13.6) $$Z'(b) = \operatorname{Im} f'(a).$$
The most important case is that of an immersion or, as used to be said
in the past, a submanifold defined by a parametric representation, a variable
point of $Z$ depending on the “parameter” $x \in X$. In the best of cases, $f$ is
both an immersion from $X$ to $Y$ and a homeomorphism from $X$ onto $f(X)$;
$f$ is then said to be an embedding of $X$ in $Y$ or, in obsolete language, a
proper parametric representation of $Z$. To suppose that $f$ is an open and
injective immersion would be equivalent since $f^{-1}$ is then continuous. As,
for any sufficiently small open subset $U \subset X$, $f$ is a diffeomorphism from
$U$ onto the open subset $f(U)$ of the manifold $f(X)$ and as $f$ is a global
homeomorphism from $X$ onto $f(X)$, it is in fact a diffeomorphism from $X$
onto $f(X)$.

H. Whitney showed that, given some inoffensive countability assumptions,
any $n$-dimensional manifold admits an embedding into $\mathbb{R}^{2n}$. This well-known
theorem is hard to prove (easy theorems rarely become famous) except
in the fairly elementary case of compact manifolds;⁵⁸ even Dieudonné, who
proves it for $\mathbb{R}^{2n+1}$ (XVI.25, exercises 2 and 13) instead of $\mathbb{R}^{2n}$, retreated
before the complete result. Doubts may arise as to the practical usefulness
of this type of theorem since a useful embedding of a manifold into a
Cartesian space is generally one that can be explicitly constructed from the
specific data of the situation. If, for example, the universe happened to be a
4-dimensional “curved” manifold, looking for an artificial embedding of it
into an 8-dimensional Cartesian space would not be very useful, though with
physicists...


⁵⁸ If $X$ is compact of dimension $d$, it can be covered by finitely many charts
$(U_p,\varphi_p)$, $N$ in number, such that for all $p$, $\varphi_p(U_p)$ is the cube $|\xi^i| < 2$ of $\mathbb{R}^d$; denoting by $V_p$
the set of $x \in U_p$ for which $\varphi_p(x)$ is in the cube $|\xi^i| < 1$, the $V_p$ may be assumed
to cover $X$. Now, there is a $C^\infty$ function $h$ on $\mathbb{R}^d$ equal to 1 for $|\xi^i| \le 1$ and
to 0 for $|\xi^i| \ge 3/2$ (for $d = 1$, see Chapter V, n° 29; the general case follows in
an obvious manner). Replacing the $U_p$ by the $V_p$ and the $\varphi_p$ by the restrictions
to $V_p$ of the functions $h[\varphi_p(x)]$, $X$ can be covered by the $N$ charts $(V_p,\varphi_p)$ for
which the $\varphi_p$ (extended by 0 outside $U_p$) are defined and of class $C^r$ on all of $X$.
The map


$$x \mapsto (\varphi_1(x),\ldots,\varphi_N(x))$$


from $X$ to $\mathbb{R}^d \times \ldots \times \mathbb{R}^d = \mathbb{R}^{Nd}$ then has rank $d$ everywhere, but is not necessarily
injective. To obtain a homeomorphism, use a partition of unity, i.e. a family of
functions $(\theta_p)$ on $X$ satisfying $\sum \theta_p(x) = 1$ for all $x$ and whose supports are
contained in the $V_p$. The map


$$x \mapsto (\varphi_1(x),\ldots,\varphi_N(x),\theta_1(x),\ldots,\theta_N(x))$$


is then continuous and injective, hence a homeomorphism from $X$ onto its image
since $X$ is compact, and has rank $d$ everywhere. This gives an embedding of $X$
into $\mathbb{R}^{Nd+N}$.
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(iii) One-Parameter Subgroups of a Torus are the classical examples of 
immersions that are not necessarily open.°? To see this, take X = R and 
for Y, take the“torus” T?, where T is the unit circle of C. It is a compact 
submanifold of C? and, at the same time, is a (multiplicative) group which 
plays the same role for periodic functions with two variable as T in Chapter 
VII and leads to the same theory; the reader will invent it without difficulty 
and will even be able to generalize it to n variables... As always, setting 


$e(t) = \exp(2\pi i t)$, 
the map 


(13.7) $f(t) = (e(at), e(bt))$, 


where $a, b \in R$ are given and both non-zero, is both a homomorphism of 
the additive group R to the multiplicative group $T^2$ and an immersion since 
its derivative 


$f'(t) = (2\pi i a\, e(at), 2\pi i b\, e(bt))$ 


is never zero. To deduce that the “one-parameter subgroup” Z = f(R) is 
a 1-dimensional submanifold of $T^2$, we need to make sure that f is an open 
map from R onto its image equipped with the topology of $T^2$. As will be 
shown, that is the case if and only if the ratio a/b is rational. Otherwise, f(R) 
is everywhere dense in $T^2$, is neither locally compact nor locally connected 
with respect to the topology of $T^2$ and so is not a submanifold of $T^2$. 


Let us first investigate the case when a/b is rational. Multiplying a and b 
by a convenient real number, a and b can be supposed to be coprime integers. 
Then there are integers u and v such that au + bv = 1 (Bézout’s theorem; 
before the Revolution, Bézout was the author of a famous Mathematics Course 
used in artillery schools that were predecessors of the École Polytechnique). 
Then map (7) has period 1. Moreover, the relation f(t) = (1,1), where (1,1) is 
the identity element of the group $T^2$, requires $at \in Z$ and $bt \in Z$, so that 
$t = t(au + bv) \in Z$. As f is a homomorphism from the additive group R to 
the multiplicative group $T^2$, it follows that 


(13.8) $f(t) = f(t') \Longleftrightarrow t - t' \in Z$. 


Hence f(R) = f(I), where I = [0,1], so that f(R) is compact and in particular 
closed in $T^2$. But (8) shows that f is obtained by composing the map $R \to 
R/Z$ with a map $\varphi$ from R/Z to $T^2$, which is obviously continuous and 
injective. As the space R/Z is compact, a generalization of theorem 12 of 
Chap. III shows that $\varphi$ is a homeomorphism from R/Z onto its image f(R): 


⁵⁹ Since the results of this section are not used in the rest of this Chapter, it can 
be skipped for the time being. 

any continuous and injective map from a compact space X to a space Y is a 
homeomorphism from X onto its image. To deduce that the map $f : R \to 
f(R)$ is open, it is, therefore, sufficient to check that so is the canonical map 
from R onto R/Z, which is clear. Thus the image f(R) is indeed a (compact) 
submanifold of $T^2$ when a/b is rational. 
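
The rational case can be checked numerically. The following is only a minimal sketch, assuming the sample values a = 3, b = 5 (they are not taken from the text) and using hypothetical helper names: it computes Bézout coefficients with the extended Euclidean algorithm and tests that the map (13.7) has period 1.

```python
import cmath

def extended_gcd(a, b):
    """Return (g, u, v) with a*u + b*v == g, where g = gcd(a, b)."""
    if b == 0:
        return (abs(a), 1 if a >= 0 else -1, 0)
    g, u, v = extended_gcd(b, a % b)
    return (g, v, u - (a // b) * v)

def e(t):
    """e(t) = exp(2*pi*i*t), a point of the unit circle T."""
    return cmath.exp(2j * cmath.pi * t)

def f(t, a, b):
    """The map (13.7): t -> (e(at), e(bt)) from R to the torus."""
    return (e(a * t), e(b * t))

# Sample coprime integers (an assumption made for this illustration).
a, b = 3, 5
g, u, v = extended_gcd(a, b)

# Largest deviation between f(t + 1) and f(t) at a sample point:
# it should vanish up to rounding, since f has period 1.
t = 0.37
period_error = max(abs(p - q) for p, q in zip(f(t + 1, a, b), f(t, a, b)))
```

Here `a*u + b*v` equals 1 because a and b are coprime, and `period_error` is zero up to floating-point rounding.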


In the general case, let us consider the homomorphism from $R^2$ onto $T^2$ 
given by 


(13.9) $F(x, y) = (e(x), e(y))$ 


and let D be the line of $R^2$ generated by the vector w = (a,b). As two points 
of $R^2$ have the same image under F if and only if they differ by a point of $Z^2$, 
the inverse image of Z = F(D) = f(R) under F is the subgroup $G = D + Z^2$, 
i.e. the set of d + z where $d \in D$, $z \in Z^2$. We show that when a/b is irrational, 
G is everywhere dense in $R^2$. 

Let w’ be a vector that is not in D; the vector space $R^2$ is the direct sum 
of D and the subspace D’ generated by w’; hence 


$G = D + G \cap D'$, 


so that it all amounts to showing that the subgroup $G' = G \cap D'$ of D’ is 
everywhere dense in D’. The set of $t \in R$ such that $tw' \in G'$ is obviously a 
subgroup H of the additive group R, and it all amounts to showing that it is 
everywhere dense in R. 


However, for such a subgroup, there are only four possibilities:⁶⁰ 


(a) H = {0}, 

(b) H = R, 

(c) H is the set mZ of integer multiples of some non-zero $m \in R$, 
(d) H is everywhere dense in R. 


Setting aside cases (a) and (b), which do not present any problems, first note 
that H contains numbers t > 0 since $t \in H$ implies $-t \in H$. Let $m \ge 0$ be the 
infimum of these numbers. 

If m = 0, for all r > 0, there exists $t \in H$ such that 0 < t < r. However for 
all $x \in R$ and any real number t > 0, there exists $q \in Z$ such that 


$|x - qt| < t$. 


Applying this remark to some $t \in H$ such that 0 < t < r, this shows that we 
are in case (d). 

If m > 0, then $m \in H$: otherwise H would contain numbers t > t’ arbitrarily 
close to m, and $t - t' \in H$ would be positive and smaller than m. The same 
arguments show that, for all $x \in H$, there exists $q \in Z$ such that 
$0 \le x - qm < m$, and so x − qm = 0 since $x - qm \in H$; hence we are in case (c). 
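
The dichotomy between cases (c) and (d) can be illustrated on the subgroup generated by 1 and a number α. This is only a sketch under assumptions not made in the text: the subgroup Z + αZ is chosen as an example, the function name is hypothetical, and the search is a brute-force one over a bounded range of coefficients.

```python
from fractions import Fraction
import math

def smallest_positive(alpha, bound):
    """Smallest positive value of m + n*alpha with |m| <= bound and |n| <= bound."""
    best = None
    for n in range(-bound, bound + 1):
        for m in range(-bound, bound + 1):
            value = m + n * alpha
            if value > 0 and (best is None or value < best):
                best = value
    return best

# Rational alpha: the subgroup Z + (3/7)Z is (1/7)Z, as in case (c).
rational_min = smallest_positive(Fraction(3, 7), 10)

# Irrational alpha (floating-point approximation of sqrt(2)): the positive
# elements found become smaller as the bound grows, as in case (d).
irrational_min_small = smallest_positive(math.sqrt(2), 10)
irrational_min_large = smallest_positive(math.sqrt(2), 50)
```

With exact rational arithmetic the infimum 1/7 is attained, whereas for the irrational sample enlarging the bound produces smaller positive elements.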


⁶⁰ Other formulations: (i) every closed subgroup of R other than R is of the form 
mZ; (ii) every subgroup of R is either some Zm (possibly with m = 0), or is 
everywhere dense in R. 

We can now return to the torus $T^2$, to the subgroup H of t such that 
$tw' \in G' = G \cap D'$ and to the subgroup 


$G = D + G' = D + Z^2$, 


which is the inverse image of Z = f(R) under F. 

If H = {0}, then G = D and so $Z^2 \subset D$, which is absurd. 

If H = R, then G’ = D’, so that G contains D and D’, and so $G = R^2$, 
i.e. $R^2 = D + Z^2$. The set of lines parallel to D is therefore countable, which 
is absurd. 

If H is in case (d), $G' = G \cap D'$ is everywhere dense in D’, G = D + G’ 
is everywhere dense in $D + D' = R^2$, and as f(R) = F(G), f(R) is clearly 
everywhere dense in $T^2$. If a/b were rational, the image f(R) = F(G) would 
be compact, thus closed, and so $F(G) = T^2$, which is absurd. Conversely, if 
H = mZ with m non-zero, then G = D + Zmw’, so that G/D is infinite cyclic; 
as $Z^2 \subset G$ has rank 2, D contains a non-zero element of $Z^2$ and a/b is rational. 
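
The density statement can be observed numerically. The trajectory meets the circle e(x) = 1 at the points (1, e(nα)) for a real number α depending on a/b, so density amounts to the fractional parts of nα coming arbitrarily close to any target. The sketch below makes assumptions not found in the text: the function name is hypothetical, α = √2 is a sample irrational value, and floating-point arithmetic stands in for exact real numbers.

```python
import math

def first_return_near(alpha, target, eps, max_n=10**6):
    """Smallest n in [0, max_n) such that the fractional part of n*alpha lies
    within eps of target (distance measured on the circle R/Z), else None."""
    for n in range(max_n):
        d = (n * alpha - target) % 1.0
        d = min(d, 1.0 - d)
        if d < eps:
            return n
    return None

# Irrational sample value: a suitable n exists for any target and tolerance.
n_irrational = first_return_near(math.sqrt(2), 0.5, 1e-3)

# Rational sample value: the fractional parts only take the values 0 and 1/2,
# so the target 1/4 is never approached and the search fails.
n_rational = first_return_near(0.5, 0.25, 0.01, max_n=10000)
```

The search succeeds for the irrational sample and returns `None` for the rational one, in line with the closed (hence non-dense) image in the rational case.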


To conclude, let us study f(R) = F(D) = Z in the neighbourhood of the 
identity element (1,1) of $T^2$. First note that the map F from $R^2$ onto $T^2$ 
given by (9) is already surjective on the closed square 


$K : 0 \le x \le 1, \quad 0 \le y \le 1$ 


and that it is injective on the open square. Drawing the vertical lines with 
abscissa p and the horizontal lines with ordinate q, where $p, q \in Z$, gives a grid 
of $R^2$ which cuts D into intervals that can be numbered by the $n \in Z$; each of 
these intervals can be mapped by an integer translation onto a line segment 
parallel to D contained in K whose endpoints are on the sides of K. Figure 8 is 
obtained from the interval $K \cap D$. 

If the slope of D is rational, these segments are periodically reproduced. 
Indeed we saw at the start of the proof that by assuming a and b to be integers 
and coprime,⁶¹ the relation f(t) = f(t’) is equivalent to $t \equiv t' \bmod Z$; but then 
the points tw and t’w of D, where w = (a,b), differ from each other by an 
element of $Z^2$, so that the intervals of D containing them can be deduced 
from each other by an integer translation; by bringing them back into K, we 
thereby get the same line segments, and so the announced periodicity holds. 
These arguments also prove that all these segments can be obtained by only 
considering the intersections of the set of points $tw \in D$ with the grid, where 
$0 \le t \le 1$. There are obviously finitely many of them. The union A of these 
line segments constructed in K is, therefore, compact, and hence, as seen 
earlier, so is the curve F(A) of $T^2$. 

If the slope of D is, on the contrary, irrational, then these segments are 
pairwise disjoint, for were it not so, there would be numbers t and $t' \ne 
t$ such that $tw - t'w \in Z^2$, so that w = (a,b) would be proportional to 
an integer vector: so the ratio a/b would be rational.⁶² As f(R) = F(D) 


⁶¹ This means that the vector w = (a,b) generating D is a primitive element of $Z^2$. 
We use this term for any integral vector belonging to a basis of $Z^n$. 
⁶² These arguments show that f is injective if and only if $a/b \notin Q$. 

Fig. 13.8. 


is everywhere dense in $T^2$, the union A of the segments obtained in K is 
everywhere dense in K. The map F being, however, a homeomorphism, if 
it is restricted to a neighbourhood of 0 in D, then the set of points of the 
trajectory Z = F(A) in the neighbourhood of the point (1,1) of $T^2$ is seen 
to be homeomorphic to the intersection of A and of a neighbourhood of 0 
in K; this intersection is composed of infinitely many pairwise disjoint line 
segments. In the neighbourhood of the identity element of $T^2$, the trajectory Z 
can, therefore, be decomposed into a countably infinite number of perfectly good 
pairwise disjoint arcs of curve. Hence if Z is equipped with the topology of $T^2$, 
we get a space which, while being connected, is not locally connected,⁶³ nor 
even locally compact since the intersection of A with a closed neighbourhood 
of 0 in K is obviously not closed and even less compact. All is now proved. 


The non-closed geodesics of the torus are examples of immersions $f : 
X \to Y$ which, while being injective, are not homeomorphisms from X onto 


⁶³ A topological space Z is said to be locally connected if, for every $x \in Z$ and 
every neighbourhood V of x in Z, there is a connected neighbourhood U of x 
such that $U \subset V$: existence of arbitrarily small connected neighbourhoods. 

their images. Such a couple (X, f) is sometimes called an immersed manifold 
in Y, a notion that should not be confused with that of an embedding defined 
above. The image f(X) is not generally a submanifold; it is a subset Z 
equipped with a manifold structure such that the map $\mathrm{id} : Z \to Y$ is an 
immersion. This is in particular encountered in the theory of Lie groups. 

The theory of differential equations contains many such phenomena. The 
movements of a gyroscope turning at constant velocity around an axis one 
of whose endpoints is fixed can be periodic, but it is an exceptional case. 
In general, the free endpoint of its axis, whose angle with the vertical varies 
between two limits determined by its initial velocity, describes a trajectory 
that is everywhere dense in the part S of the sphere contained between these 
two limit inclinations. It passes infinitely many times in the neighbourhood 
of every point of S. 

The problem detailed in this section can be generalized by considering 
the map 


$t \mapsto (e(a_1 t), \ldots, e(a_n t))$ 


from R to $T^n$. Its image is everywhere dense in $T^n$ if and only if the $a_i \in R$ 
are linearly independent⁶⁴ over Q, i.e. if the relation 


$x_1 a_1 + \ldots + x_n a_n = 0, \qquad x_1, \ldots, x_n \in Q$ 


implies that $x_i = 0$ for all i. The proof is the same, but slightly less easy since 
it first requires finding all the closed subgroups of $R^n$: such a subgroup is the 
set of vectors which, with respect to a conveniently chosen basis $(u_i)$ for $R^n$, 
can be written as $\sum x^i u_i$, where $x^i \in R$ for $1 \le i \le p$, $x^i \in Z$ for $p+1 \le i \le q$ 
and $x^i = 0$ for i > q. Let us now return to more general manifolds. 


(iv) Submanifolds of a Cartesian space: tangent vectors. In the case of a 
p-dimensional submanifold X of a q-dimensional Cartesian space Y, a more 
concrete description of tangent spaces X’(x) can be given; it has no use (as 
Dieudonné remarks with reason in Eléments d’analyse, vol. 3, p. 2, it may 
even lead the reader on a wrong track), but it makes it possible to relate 
“abstract” tangent spaces X’(x) to “tangent planes” of classical geometry. 
In what follows, we suppose that $Y = R^q$, and we will identify $R^q$ with the 
Cartesian product $R^p \times R^{q-p}$. 

First note that, in this case, the 18th century method for representing a 
plane curve either by an equation y = f(x), or by an equation x = g(y), can 
be generalized here. Indeed if, as a general rule, we denote by $y^i$ the canonical 
coordinates of some $y \in Y$, we know (end of (i)) that in the neighbourhood of 
all $a \in X$, the restrictions $x^i$ of these q functions to X form a system of rank 
p. Hence, up to a permutation of the canonical coordinates, the first p functions 
$x^i$ may be supposed to define a chart of X in a neighbourhood U 
of a in X, which means that the projection 


⁶⁴ R is an infinite-dimensional vector space over Q. 

$y \mapsto (y^1, \ldots, y^p)$ 


from $R^q = Y$ onto $R^p$ induces a diffeomorphism from U onto an open subset 
U’ of $R^p$. The other coordinates $x^j$ being of class $C^r$ on U, we get the relations 


(13.10) $y^j = f^j(y^1, \ldots, y^p)$ for all $y \in U$ $(p+1 \le j \le q)$, 


where $f^j \in C^r(U')$. Any $y \in Y$ sufficiently near a, such that $(y^1, \ldots, y^p) \in U'$ 
and satisfying relations (10), is then in U since there is one (and only one) 
$x \in U$ whose first p coordinates are given in U’, its other q − p coordinates 
satisfying (10). In other words, 


(13.11) $y \in U \Longleftrightarrow (y^1, \ldots, y^p) \in U' \ \&\ y^j = f^j(y^1, \ldots, y^p)$. 


If $(f^{p+1}, \ldots, f^q)$ is considered to be a map f from U’ to $R^{q-p}$, this is 
equivalent to saying that in the neighbourhood of a, the submanifold X of $R^q$ is 
the graph of a map from $R^p$ to $R^{q-p}$. Simply using coordinate changes given 
by canonical permutations, precursors were right to believe that, locally, any 
“good” curve or surface is the graph of a “good” function. As seen in 
section (i), this is in particular the case if X is defined globally by the “implicit 
equations” $F^k(y^1, \ldots, y^q) = 0$ provided the rank of the map $y \mapsto (F^k(y))$ is 
constant in the neighbourhood of X or else X is the image of an open subset 
of $R^p$ under an open immersion. 

Having said this, the elementary classical definition of a tangent vector 
u to a submanifold X of Y at x, for example to a sphere, says that there 
is a curve $t \mapsto \mu(t)$ in X such that $\mu(0) = x$, $\mu'(0) = u$, where $\mu'(0) = 
\lim[\mu(t) - \mu(0)]/t$ here. The set $T_x(X)$ of these vectors is a vector subspace⁶⁵ 
of Y, not to be confused with the abstract vector space X’(x). Indeed, in 
the neighbourhood of x, X may be supposed to be the graph of a map f from 
$R^p$ to $R^{q-p}$. As $R^q$ is identified with $R^p \times R^{q-p}$, x = (a,b) with $a \in R^p$ and 
$b = f(a) \in R^{q-p}$. Similarly, $\mu(t) = (\mu_1(t), \mu_2(t))$ with $\mu_1(t) \in R^p$ and 
$\mu_2(t) = f[\mu_1(t)]$ since $\mu(t) \in X$. $\mu_1$ can be arbitrarily chosen and, setting 
$k = \mu_1'(0) = \lim[\mu_1(t) - \mu_1(0)]/t$, it follows that 


(13.12) $u = \mu'(0) = (k, f'(a)k)$. 


We, therefore, conclude that $T_x(X)$ is the image of $R^p$ under the linear map 
$k \mapsto (k, f'(a)k)$, which gives the result. We thus recover the fact that, for 
q = 2, p = 1, the slope of the tangent to the graph of the function f at 
the point (a, f(a)) is the number f’(a) to which the linear map f’(a) of the 
general case can be identified in dimension 1. Observe also that, contrary to what 


⁶⁵ In classical geometry, the set $T_x(X)$ thus defined is not the “tangent plane” to 
X at x; depending on the point of view taken, the latter is a set either of points 
of Y or of vectors with initial point x. $T_x(X)$ is the vector subspace of Y which 
can be deduced from the traditional tangent plane by using the translation 
mapping the point x to 0. 

M. Mandelbrot seems to think, we do not eliminate “geometry” from our 
preoccupations: our definitions and even our notation directly generalize those 
of the 17th century. 
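
Formula (13.12) can be tested on the example of the sphere mentioned above. The following sketch relies on assumptions made only for illustration: the upper unit hemisphere is written as the graph of $f(a) = \sqrt{1 - a_1^2 - a_2^2}$ (so p = 2, q = 3), the function names are hypothetical, and the limit defining μ’(0) is replaced by a symmetric difference quotient.

```python
import math

def f(a1, a2):
    """Upper unit hemisphere as a graph over the open unit disc."""
    return math.sqrt(1.0 - a1 * a1 - a2 * a2)

def df(a1, a2, k1, k2):
    """Tangent linear map f'(a) applied to k = (k1, k2)."""
    return -(a1 * k1 + a2 * k2) / f(a1, a2)

def tangent_vector(a, k):
    """The vector u = (k, f'(a)k) of formula (13.12)."""
    return (k[0], k[1], df(a[0], a[1], k[0], k[1]))

def curve_derivative(a, k, h=1e-6):
    """Symmetric difference quotient at t = 0 of mu(t) = (a + tk, f(a + tk))."""
    plus = f(a[0] + h * k[0], a[1] + h * k[1])
    minus = f(a[0] - h * k[0], a[1] - h * k[1])
    return (k[0], k[1], (plus - minus) / (2 * h))

a, k = (0.3, 0.4), (1.0, -2.0)
u = tangent_vector(a, k)
x = (a[0], a[1], f(*a))                       # the point x = (a, f(a)) of the sphere
dot = sum(ui * xi for ui, xi in zip(u, x))    # vanishes: u is orthogonal to the radius
```

The vector `u` agrees with `curve_derivative(a, k)` up to discretization error, and its scalar product with the radius vector vanishes, which is the classical description of the tangent plane to a sphere.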

But as seen in (ii), the curve $\mu$ drawn in X makes it also possible to define 
some $h \in X'(x)$ independently from the embedding of X into a Cartesian 
space: under every local chart $(U, \varphi)$ of X at x, $\mu$ becomes a curve $t \mapsto \varphi[\mu(t)]$ 
in $R^p$ and, cf. (12.8), 


$h(\varphi) = \frac{d}{dt}\varphi[\mu(t)]$ for t = 0. 


If, as above, the submanifold X is defined in the neighbourhood of x = (a, b) 
by an equation y = f(x) and if $\mu(t) = (\mu_1(t), \mu_2(t))$, the map $(x, y) \mapsto x$ from 
X to $R^p$ can be chosen as $(U, \varphi)$ since it defines the manifold structure on X. 
Then $\varphi[\mu(t)] = \mu_1(t)$ and so $h(\varphi) = \mu_1'(0)$, where this is the usual derivative. 
Hence, by (12), 


(13.13) $\mu'(0) = (h(\varphi), f'(a)h(\varphi))$, 


where f’(a) is the tangent linear map to f at a in the usual sense of n° 2, 
(i). As the map $h \mapsto h(\varphi)$ from X’(x) to $R^p$ is linear and bijective, and as 
formula (12) likewise defines a linear bijection $k \mapsto u$ from $R^p$ onto $T_x(X)$, we 
thereby obtain by composition an isomorphism $h \mapsto u$ from the “abstract” 
vector space X’(x) onto the “concrete” vector subspace $T_x(X)$ of $R^q$. For 
any curve $\mu$ drawn in X and such that $\mu(0) = x$, this isomorphism maps the 
“abstract” tangent vector $\mu'(0) \in X'(x)$ onto the vector 


$\lim[\mu(t) - \mu(0)]/t$ 


also written $\mu'(0)$ by everyone. This also proves that the isomorphism of 
vector spaces $X'(x) \to T_x(X)$ defined thereby is absolute, i.e. depends only 
on the embedding of X into $R^q$ and not on the choice of the chart. 

One can nevertheless understand why this identification is a false trail, in the 
words of Dieudonné: it in particular suggests that the vector spaces X’(x) and 
X’(y) tangent to X at two different points x and y could, like the subspaces 
$T_x(X)$ and $T_y(X)$ of $R^q$, have common elements; this is not at all the case: 
an element h of X’(x) is a couple consisting of a point $x \in X$ and of a family 
of vectors $h(\varphi)$ of $R^p$ depending on a local chart at x; two tangent vectors 
at x and y cannot be equal if $x \ne y$. In fact, the set of tangent vectors to an 
n-dimensional manifold is a 2n-dimensional manifold [n° 12(v)]. 


(v) Riemann spaces. The definition of tangent spaces makes it possible 
to define Riemann spaces. For this, in each X’(x) take a Euclidean scalar 
product (h|k) compelled to depend on x in a reasonable manner: suppose 
that the functions 


$g_{ij}(\xi) = (a_i(\xi)|a_j(\xi))$ 

are at least C? on each chart $(U, \varphi)$. If $h = h^i a_i(\xi)$ and $k = k^i a_i(\xi)$ are two 
tangent vectors to X at x, then 


$(h|k) = g_{ij}(\xi) h^i k^j$ 
and in particular, for the metaphysical vector $dx = a_i(\xi)d\xi^i$, 
$ds^2 = (dx|dx) = g_{ij}(\xi)d\xi^i d\xi^j$, 


i.e. the square of the length of dx. The length of a curve $\mu : [a,b] \to X$ can 
then be defined by the formula 


$m(\mu) = \int_a^b (\mu'(t)|\mu'(t))^{1/2}\, dt$ 


and it is possible to look for the geodesics, i.e. the curves of minimal length 
connecting two given points of X, if they exist. It is also possible to define the 
covariant derivative of a tensor field, etc. All this has given rise to an extensive 
literature whose latest avatar seems to be Serge Lang’s book, Riemannian 
Geometry (Springer, 1999). 

If X is a submanifold of a Euclidean space E (i.e. a Cartesian space 
equipped with a Hilbert scalar product) and if $h, k \in X'(x)$ are defined by 
the curves $\mu$ and $\nu$ as mentioned in (iii) of n° 12, it is natural to set 


$(h|k) = (\mu'(0)|\nu'(0))$, 


where $\mu'(0)$ and $\nu'(0)$ are defined as limits. Equivalently, note that X’(x) can 
be canonically identified with a vector subspace of E’(x), hence of E, which 
gives the scalar product sought in X’(x). The explicit computation of the $ds^2$ 
of X is particularly simple when X is defined by a parametric representation 
$x = \sigma(t)$ in the neighbourhood of the point a, where t varies in an open subset of 
$R^d$ and where $\sigma$ is an open immersion. Then X’(x) is the image of $R^d$ under 
$\sigma'(t)$ if $x = \sigma(t)$, so that if, in Leibniz style, we set $dx = \sigma'(t)dt$, then 


$ds^2 = (\sigma'(t)dt|\sigma'(t)dt)$. 
For example, for the unit sphere in $R^3$ in spherical coordinates 
$x = \cos\varphi\cos\psi, \quad y = \sin\varphi\cos\psi, \quad z = \sin\psi$, 


we differentiate x, y and z, we simply calculate $dx^2 + dy^2 + dz^2$ and we find 
that $ds^2 = \cos^2\psi\, d\varphi^2 + d\psi^2$. 
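
The computation for the sphere can be checked numerically. The sketch below is an illustration under stated assumptions: the function names are hypothetical, and the partial derivatives of the parametrization σ(φ, ψ) are approximated by central differences instead of being computed by hand.

```python
import math

def sigma(phi, psi):
    """Spherical-coordinate parametrization of the unit sphere in R^3."""
    return (math.cos(phi) * math.cos(psi),
            math.sin(phi) * math.cos(psi),
            math.sin(psi))

def metric(phi, psi, h=1e-6):
    """Coefficients (g11, g12, g22) of
    ds^2 = g11 dphi^2 + 2 g12 dphi dpsi + g22 dpsi^2,
    computed from central-difference approximations of the partials of sigma."""
    d_phi = [(p - m) / (2 * h)
             for p, m in zip(sigma(phi + h, psi), sigma(phi - h, psi))]
    d_psi = [(p - m) / (2 * h)
             for p, m in zip(sigma(phi, psi + h), sigma(phi, psi - h))]

    def dot(u, v):
        return sum(x * y for x, y in zip(u, v))

    return dot(d_phi, d_phi), dot(d_phi, d_psi), dot(d_psi, d_psi)

# Sample point; the expected values are cos(psi)^2, 0 and 1.
g11, g12, g22 = metric(0.7, 0.4)
```

Up to discretization error, `g11` equals cos²ψ, `g12` vanishes and `g22` equals 1, which is the formula $ds^2 = \cos^2\psi\, d\varphi^2 + d\psi^2$.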

It can be shown that for any connected Riemann space X, there is a 
diffeomorphism from X onto a submanifold Y of some space $R^n$ which 
transforms the given $ds^2$ in X into that of Y (John Nash, 1956). Dieudonné (vol. 4, 
XX.15) proves a far weaker result (E. Cartan) as it is purely local, but it is 
already difficult. 

Since we have defined tensors at a point of a manifold X of class $C^r$ in n° 12, 
(i), we can now define tensor fields of type (p,q) on X or, more generally (?), 
on an open subset of X. They associate to each point $x \in X$ a tensor of type 
(p,q) at the point x considered, either as a multilinear function of p vectors 
and q covectors at x, or as a system of numbers $T^{j_1 \ldots j_q}_{i_1 \ldots i_p}$ depending on x and 
on a local chart at x, and compelled to transform by the Italian formulas 
under chart change. T will be said to be of class $C^s$ if in every local chart, its 
components are $C^s$ functions on the corresponding open Cartesian subspace; 
we need to assume that $s \le r - 1$ because if changes of charts are of class $C^r$, 
their first partial derivatives are only of class $C^{r-1}$; thus there is no hope of 
the Italian formulas transforming $C^r$ functions into $C^r$ functions. 

In practice, the most important tensor fields are the vector fields and 
the differential forms that will be defined below. A vector field L is of type 
(0,1), so that it is obtained by associating to each $x \in X$ a tangent vector 
$L(x) \in X'(x)$ to X at x. Hence, if $(U, \varphi)$ is a chart, then for each $x \in U$ there 
is a vector 


$L(x)(\varphi) = L^i(\xi)e_i$ where $\xi = \varphi(x)$, 


i.e. a vector field (in the sense used by physicists) on the open subset $\varphi(U)$ 
of $R^d$, d = dim(X), with the obvious formulas for chart change. L will also be 
said to be of class $C^s$ if so are the functions $L^i$. 

If f is a function of class $C^s$ $(1 \le s \le r)$ on an open set $W \subset X$, then set 


$Lf(x) = df[x; L(x)]$ 


for all $x \in W$; many authors prefer to denote the function Lf by $D_L f$. If 
$(U, \varphi)$ is a chart, with $U \subset W$, and if F is the function which expresses f in 
$\varphi(U)$, so that $f = F \circ \varphi$, then, by the multivariate chain rule, 


$Lf(x) = dF[\varphi(x); \varphi'(x)L(x)]$; 
but by (12.13), we know that for $h \in X'(x)$, 
$\varphi'(x)h = h(\varphi) = h^i(\varphi)e_i$. 
Setting $\xi = \varphi(x)$, it follows that 
(14.1) $Lf(x) = dF[\xi; L^i(\xi)e_i] = L^i(\xi)dF(\xi; e_i) = L^i(\xi)D_iF(\xi)$, 


where the $D_iF$ are the usual partial derivatives of F and the $L^i(\xi)$ the 
components of L(x) in the chart considered. Obviously, 


$L(fg) = Lf.g + f.Lg$ 


for all f and g. 

Relation (14.1) suggests a generalization to manifolds of (linear) classical 
differential operators. For simplicity’s sake, on a manifold X of class $C^\infty$, 
a differential operator of order $\le p$ and class $C^\infty$ defines, for every open 
subset U, a linear map $L : C^\infty(U) \to C^\infty(U)$ satisfying the following two 
conditions: (i) it is compatible with the restriction maps $C^\infty(U) \to C^\infty(V)$ 
for $V \subset U$, so that for $f \in C^\infty(U)$, the value of the function $Lf \in C^\infty(U)$ at 
$x \in U$ only depends on the behaviour of f in the neighbourhood of x, (ii) in 
each chart $(U, \varphi)$ of X, there is a relation of the form 


(14.2) $Lf(x) = \sum_{k=0}^{p} \sum_{i_1, \ldots, i_k} L^{i_1 \ldots i_k}(\xi) D_{i_1} \ldots D_{i_k} F(\xi)$, 


where the functions $L^{i_1 \ldots i_k}$ only depend on the chart considered and where 
$\xi$, F and the $D_i$ are defined as in (1). 

Obvious operations can be defined on these operators: sum L + M, product 
LM, product fL by a function. If L and M are of order p and q respectively, 
their product $LM : f \mapsto L(Mf)$ is generally of order p + q, but their Jacobi 
bracket 


(14.3) $[L, M] = LM - ML$ 


is of order $\le p + q - 1$. It suffices to check this in $R^n$; we can then suppose 
that 


$L = \varphi D_{i_1} \ldots D_{i_p} = \varphi D_{(i)}, \qquad M = \psi D_{j_1} \ldots D_{j_q} = \psi D_{(j)}$ 


with given functions $\varphi$ and $\psi$, and it all amounts to checking that, if we 
calculate 


$LMf - MLf = \varphi D_{(i)}[\psi D_{(j)}f] - \psi D_{(j)}[\varphi D_{(i)}f]$ 


then the terms $\varphi\psi D_{(i)}D_{(j)}f$ and $\psi\varphi D_{(j)}D_{(i)}f$ cancel each other. 

In particular, if L and M are defined by vector fields, the same holds for 
[L, M]; using Einstein’s convention, it can be immediately seen that, in any 
chart, 


(14.4) $L = L^i D_i, \quad M = M^i D_i \Longrightarrow [L, M] = N^i D_i$ 
with 
(14.5) $N^i = L^j.D_j(M^i) - M^j.D_j(L^i)$. 


Hence, if $D_L$ denotes the differential operator $f \mapsto Lf$, the vector field [L, M] 
satisfies 


$D_{[L,M]} = D_L D_M - D_M D_L$. 


Exercise 1. Verify by a direct calculation that the vector field defined by 
(5) does not depend on the chart used. 

Exercise 2. Let $L_1, \ldots, L_n$ be $C^\infty$ vector fields on an open subset U of X; 
suppose that, for all $x \in U$, the $L_i(x)$ form a basis for X’(x). Show that all 
differential operators on U can be written uniquely as a finite sum 


$\sum_{p_i \ge 0} a_{p_1 \ldots p_n} L_1^{p_1} \ldots L_n^{p_n}$. 


Note that as [L, M] = N involves the derivatives of the components of L 
and M, the value $N(x) \in X'(x)$ of N at x does not only depend on the vectors 
L(x) and M(x). The bracket [h, k] of two tangent vectors is not well-defined. 
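
Formula (14.5) and the last remark can be illustrated in $R^2$. The sketch below rests on assumptions introduced for the example only: the names are hypothetical, the sample fields are $L = D_1$, $M = x^1 D_2$ and the zero field, and partial derivatives are replaced by central differences.

```python
def partial(g, p, j, h=1e-5):
    """Central-difference approximation of D_j g at the point p (g is scalar-valued)."""
    plus, minus = list(p), list(p)
    plus[j] += h
    minus[j] -= h
    return (g(plus) - g(minus)) / (2 * h)

def bracket(L, M, p):
    """Components N^i = L^j D_j(M^i) - M^j D_j(L^i) of [L, M] at p, as in (14.5)."""
    n = len(p)
    Lp, Mp = L(p), M(p)
    return [sum(Lp[j] * partial(lambda q: M(q)[i], p, j)
                - Mp[j] * partial(lambda q: L(q)[i], p, j)
                for j in range(n))
            for i in range(n)]

# Sample fields on R^2, given by their components in the canonical chart.
L = lambda p: (1.0, 0.0)        # L = D_1
M = lambda p: (0.0, p[0])       # M = x^1 D_2
Mzero = lambda p: (0.0, 0.0)    # the zero vector field

p = (0.0, 0.5)                  # a point where M(p) = Mzero(p)
N = bracket(L, M, p)            # approximately (0, 1)
N0 = bracket(L, Mzero, p)       # (0, 0)
```

At the point p the fields `M` and `Mzero` have the same value, yet their brackets with `L` differ, which is why no bracket of two individual tangent vectors can be defined.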

The existence of vector fields satisfying global conditions poses problems 
related to the topology of manifolds: do there exist vector fields on X such 
that $L(x) \ne 0$ for all x? If X has dimension n, are there n vector fields $L_i$ 
such that the $L_i(x)$ form a basis for X’(x) for all $x \in X$? The answers are 
already in the negative for the 2-dimensional sphere. 


15 — Vector Fields and Differential Equations 


Vector fields serve to generalize the theory of first order differential equations 
to manifolds; we can start by searching for integral curves or trajectories 
$t \mapsto \gamma(t)$ of a vector field L on a p-dimensional manifold X; they are 
defined by the condition 


(15.1) $\gamma'(t) = L[\gamma(t)]$, 


where the left hand side is defined as in n° 12, (ii). In a local chart $(U, \varphi)$ 
such that $\varphi(U) = R^p$, this is equivalent to looking for a function $x(t) = \varphi[\gamma(t)]$ 
satisfying 


(15.2) $Dx(t) = L[x(t)]$, 


where D = d/dt and where the functions x(t) and L(x) have values in $R^p$. 
When L is C?, the following results hold: 


(a) for all $t_0 \in R$ and $x_0 \in X$, there exists a solution of (1) defined in the 
neighbourhood of $t_0$ and such that $\gamma(t_0) = x_0$; 

(b) two such solutions coincide on the interval on which they are 
simultaneously defined. 


Statement (a) will follow from the analogous statement for equation (2). 
Similarly for (b), because if two solutions of (1) defined on the same interval I 
and equal at some point $t \in I$ are known to be also equal on a neighbourhood 
of t, then the set of $t \in I$ where they are equal is both open and closed 
(continuity) in I, and so is equal to I. 

Statement (b) shows that the solutions of (1) with a given value at a 
given point $t_0$ are the restrictions to their intervals of definition of a unique 
solution, defined on the union I of these intervals; I is the largest interval on 
which (1) has a solution, which is, therefore, said to be maximal. I is clearly 
open because of (a), but in general $I \ne R$: if X is an open subset of $R^p$ and L 
is a vector field whose vectors can be deduced from each other by parallelism, 
the trajectories are open line segments contained in X having as endpoints 
boundary points of X. 

If, for all $x \in X$, $\gamma_x(t)$ denotes the maximal trajectory such that $\gamma_x(0) = 
x$ and $I_x$ is its interval of definition, we are led to introduce the set $\Omega \subset R \times X$ 
of (t, x) such that $t \in I_x$ and to set 


(15.3) $\gamma(t, x) = \gamma_x(t)$ 


for $(t, x) \in \Omega$. This gives a map $\gamma : \Omega \to X$, the global flow of the vector 
field L. The following additional result then holds: 


(c) If L is of class $C^k$, then $\Omega$ is open in $R \times X$ and $\gamma$ is of class $C^k$ in $\Omega$. 


Similarly to (a) and (b), this statement is of a local nature: it amounts 
to showing that the solution of (2) satisfying $x(0) = \xi$ is defined and a $C^k$ 
function of $(t, \xi)$ on the product of an interval centered at 0 and a ball with 
given centre. 

There are analogous statements for more general differential equations: 
instead of a function L(x) of the only variable $x \in R^p$, L can be supposed to 
depend on t, x and a parameter z varying in a Cartesian space. The purpose 
is then to find a function x(t) satisfying 


(15.4) $x'(t) = L[t, x(t), z], \quad x(0) = \xi$ 


and to show that if L is $C^k$, then x(t) is a $C^k$ function of t, $\xi$ and z. 

Everything on the subject and even more can be found in Dieudonné, 
vol. 1, X.4 to X.8, but as this reference is not very easy to read, here we will 
substitute a low-powered proof for this type of high-powered one (expression 
used by Spivak for Serge Lang’s Analysis II) in the manner of the inventors 
of the successive approximation method, for example Emile Picard. We have 
already used it in a particular case related to Bessel’s equation in Chap. VI, 
n° 10. 

(i) Reduction to an integral equation. If L is assumed to be continuous 
and if we consider $x(0) = \xi$, (4) amounts to solving 


(15.5) $x(t) = \xi + \int_0^t L[u, x(u), z]\, du$, 


where this is an oriented integral. Setting 


(15.6) $x(t) = y(t) + \xi, \quad L(t, y + \xi, z) = L(t, y, \xi, z)$ 


we are led to solve 


$y(t) = \int_0^t L[u, y(u), \xi, z]\, du$. 



As $\xi$ and z are now parameters varying in Cartesian spaces, we might as 
well consider the parameter to be the couple $(\xi, z)$ and remove $\xi$ from the 
notation. Hence, we will suppose that the new function L(t, y, z) is defined 
and $C^k$ on a compact subset $|t| \le a$, $\|y\| \le b$, $\|z\| \le c$ of $R \times R^p \times R^q$. 

To solve 


(15.7) $y(t) = \int_0^t L[u, y(u), z]\, du$, 


define the functions $y_n(t, z)$ by 


(15.8) $y_0(t, z) = 0, \quad y_{n+1}(t, z) = \int_0^t L[u, y_n(u, z), z]\, du$ 


and hope they converge uniformly on every compact interval, in which case 
their limit is the solution of the problem. 


(ii) Existence of solutions. Temporarily omitting the parameter z from 
the expressions for $y_n$, we get 


(15.9) $y_{n+1}(t) - y_n(t) = \int_0^t \{L[u, y_n(u), z] - L[u, y_{n-1}(u), z]\}\, du$. 


An upper bound for the integral can be found by using the mean value 
formula: if $D_2L(t, y, z)$ denotes the derivative of the map $y \mapsto L(t, y, z)$ and 
if 


$\|D_2L(t, y, z)\| \le M'$ for $|t| \le a$, $\|y\| \le b$, $\|z\| \le c$, 
then 
(15.10) $\|L[u, y_n(u), z] - L[u, y_{n-1}(u), z]\| \le M'\|y_n(u) - y_{n-1}(u)\|$ 
for $|u| \le a$ and $\|z\| \le c$ as long as the $y_n(u)$ remain in the ball $\|y\| \le b$. 


But let $M = \sup \|L(t, y, z)\|$ for $|t| \le a$, $\|y\| \le b$, $\|z\| \le c$; if $\|y_n(t)\| \le b$, 
(8) shows that $\|y_{n+1}(t)\| \le M|t| \le b$ if $|t| \le b/M$. Setting 


$a' = \inf(a, b/M)$, 
we see that if the relation 
(15.11) $|t| \le a', \ \|z\| \le c \Longrightarrow \|y_n(t)\| \le b$ 


is true for some n, then it is true for n + 1. As $y_0(t) = 0$, (11) holds for all n, 
which makes it possible to use (10) within the limits indicated for t and z. 

(8) primarily shows that $\|y_1(t)\| \le M|t|$. Integrating from 0 to t in all 
integrals in u of this n°, we then get 

$\|y_2(t) - y_1(t)\| \le M' \int \|y_1(u)\|\, du \le MM'a'^2/2$, 
$\|y_3(t) - y_2(t)\| \le M' \int \|y_2(u) - y_1(u)\|\, du \le MM'^2a'^3/3!$ 
and so on until 


(15.12) $\|y_{n+1}(t) - y_n(t)\| \le MM'^n a'^{n+1}/(n+1)!$ for $|t| \le a'$. 


The series $\sum[y_{n+1}(t) - y_n(t)]$, dominated by an exponential series, thus 
converges normally in $|t| \le a'$ to a solution y(t) = y(t, z) of (7) defined for $|t| \le a'$ 
and $\|z\| \le c$ and with values in $\|y\| \le b$, proving statement (a). Note that this 
result only assumes L to be continuous and to have a continuous derivative 
$D_2L(t, y, z)$. 
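
The successive approximations (15.8) are easy to experiment with. The following sketch rests on assumptions that are not in the text: the scalar equation with L(t, y) = y + 1 (whose solution with y(0) = 0 is $e^t - 1$) is chosen as a test case, the parameter z is dropped, functions are sampled on a uniform grid of [0, a], the integral is replaced by the trapezoidal rule, and the function name is hypothetical.

```python
import math

def picard(L, a, steps, iterations):
    """Successive approximations y_{n+1}(t) = int_0^t L(u, y_n(u)) du on [0, a].

    Functions are represented by their values on a uniform grid and the
    integral is approximated by the trapezoidal rule. Returns the grid and
    the list of iterates y_0, ..., y_iterations.
    """
    h = a / steps
    ts = [i * h for i in range(steps + 1)]
    y = [0.0] * (steps + 1)                 # y_0 = 0
    iterates = [y]
    for _ in range(iterations):
        vals = [L(t, yt) for t, yt in zip(ts, y)]
        new = [0.0]                         # every iterate vanishes at t = 0
        for i in range(steps):
            new.append(new[-1] + 0.5 * h * (vals[i] + vals[i + 1]))
        y = new
        iterates.append(y)
    return ts, iterates

# Test case (an assumption): y' = y + 1, y(0) = 0, exact solution exp(t) - 1.
ts, iterates = picard(lambda t, y: y + 1.0, 1.0, 1000, 20)
error_at_1 = abs(iterates[-1][-1] - (math.e - 1.0))

# Sup norms of y_{n+1} - y_n for the first few n; they decay factorially here.
diffs = [max(abs(p - q) for p, q in zip(iterates[n + 1], iterates[n]))
         for n in range(6)]
```

In this example the differences between consecutive iterates shrink like 1/(n+1)!, in line with the exponential series dominating the sum, and the last iterate agrees with $e^t - 1$ up to the quadrature error.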

A very particular and important case is that of a linear differential 
equation, i.e. for which L is of the form 


$L(t, y, z) = A(t, z)y + b(t, z)$ 


with functions A(t, z) and b(t, z) defined and continuous for $|t| \le a$, $\|z\| \le c$. As 
L(t, y, z) is defined without any restrictions on y, the successive approximations 
$y_n(t, z)$ are defined and continuous; hence there are no other restrictions 
on the domain of existence of solutions apart from those imposed on the 
coefficients A(t, z) and b(t, z). 


(iii) Uniqueness of the solution. A similar calculation to (9) and (10) 
shows that if y’ and y’’ (these are not derivatives) are solutions of (7), then 
setting 


$k(r) = \sup_{|t| \le r} \|y'(t) - y''(t)\|$, 


gives 


$\|y'(t) - y''(t)\| \le \int \|L[u, y'(u), z] - L[u, y''(u), z]\|\, du \le M'k(r)|t|$ 


for $|t| \le r$ as long as y’(u) and y’’(u) remain in the ball $\|y\| \le b$, which is 
the case for sufficiently small r since (7) implies y(0) = 0. Taking the sup 
of the left hand side for $|t| \le r$, $k(r) \le M'k(r)r$ follows, and so k(r) = 0 if 
r < 1/M’. This proves statement (b). 


(iv) Dependence on initial conditions. It is now a matter of showing that,
for $t$ and $z$ in the neighbourhood of 0, like $L$, the solution $y(t, z)$ of (7) is of
class $C^k$ as a function of $(t, z)$. Suppose first that $k = 1$; it all amounts to
showing that the functions (8) are $C^1$ and that their first derivatives converge
uniformly (Chap. III, n° 22, theorem 23).

Formula (8) shows that, if $L$ and $y_n(t, z)$ are $C^1$, so is $y_{n+1}(t, z)$, proving
the first point. As $Dy_{n+1}(t, z) = L[t, y_n(t, z), z]$, where $D = d/dt$, the $Dy_n(t, z)$ converge
uniformly on $|t| < a'$, $|z| < c$, and so the continuity of $Dy(t, z)$ follows. The
case of derivatives with respect to the parameter $z$ cannot be dealt with so
easily. In what follows, $D_2$ (resp. $D_3$) will denote the operator transforming
a function $f(t, y, z)$ into the derivative of the map $y$ (resp. $z$) $\mapsto f(t, y, z)$;
they correspond to the partial differentials of n° 2, (iii). We, therefore, need
to prove the convergence of the functions

$$Y_n(t, z) = D_3 y_n(t, z) : \mathbb{R}^q \to \mathbb{R}^p.$$


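Applying $D_3$ to the function $z \mapsto L[u, y_n(u, z), z]$ under the $\int$ sign in (8),
we first get

(15.13) $$\begin{aligned} Y_{n+1}(t, z) &= \int_0^t \left\{ D_2L[u, y_n(u, z), z] \cdot Y_n(u, z) + D_3L[u, y_n(u, z), z] \right\} du \\ &= \int_0^t \left[ U_n(u, z) Y_n(u, z) + V_n(u, z) \right] du, \end{aligned}$$

where

$$U_n(u, z) = D_2L[u, y_n(u, z), z] : \mathbb{R}^p \to \mathbb{R}^p,$$
$$V_n(u, z) = D_3L[u, y_n(u, z), z] : \mathbb{R}^q \to \mathbb{R}^p.$$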


However, $D_2L$ and $D_3L$ are uniformly continuous on the compact subset on
which they are defined and $y_n(u, z)$ converges uniformly to $y(u, z)$. Hence

(15.14') $$D_2L[u, y(u, z), z] = U(u, z) = \lim U_n(u, z)$$

(15.14'') $$D_3L[u, y(u, z), z] = V(u, z) = \lim V_n(u, z)$$

uniformly on $|u| < a'$, $|z| < c$. If the left hand side of (13) converges uniformly
to a limit $Y(t, z)$, it will, therefore, satisfy

(15.15) $$Y(t, z) = \int_0^t \left[ U(u, z) Y(u, z) + V(u, z) \right] du.$$


However, (15) is an integral equation of type (8), where $L(t, y, z)$ is replaced
by a function $M(t, Y, z) = U(t, z)Y + V(t, z)$ that is linear (affine) in $Y$. As seen at
the end of section (ii) of the proof, (15) has a unique solution, defined for
$|t| < a$, $|z| < c$, and so

(15.16) $$D_n(t, z) = \|Y_n(t, z) - Y(t, z)\|$$

needs to be shown to converge uniformly to 0 on the compact subset $|t| \le a'$,
$|z| \le c$.


However, (13) and (15) show that

$$\begin{aligned} D_{n+1}(t, z) &\le \int \|U_n(u, z) Y_n(u, z) - U(u, z) Y(u, z)\| \, du + \int \|V_n(u, z) - V(u, z)\| \, du \\ &\le \int \|U_n(u, z) - U(u, z)\| \, \|Y(u, z)\| \, du + \int \|U_n(u, z)\| \, D_n(u, z) \, du + \int \|V_n(u, z) - V(u, z)\| \, du \end{aligned}$$

with unoriented integrals extended over the interval $I(t)$ with endpoints 0
and $t$; as $I(s) \subset I(t)$ for $s \in I(t)$, the right hand side is even an upper bound
for $D_{n+1}(s, z)$ for all $s \in I(t)$. If $k$ is a constant upper bound for $\|Y(u, z)\|$ and
the $\|U_n(u, z)\|$ (they converge uniformly) for $|u| < a'$ and $|z| < c$, then

$$D_{n+1}(s, z) \le k \int \|U_n(u, z) - U(u, z)\| \, du + k \int D_n(u, z) \, du + \int \|V_n(u, z) - V(u, z)\| \, du,$$

where integration is over $I(t)$, and not only over $I(s)$.
Let $r > 0$. There exists $N = N(r)$ such that

$$\|U_n(u, z) - U(u, z)\| \le r \quad \& \quad \|V_n(u, z) - V(u, z)\| \le r$$

for $n > N$, $|u| < a'$, $|z| < c$. The previous relation then shows that, for
$n > N$,

(15.17) $$D_{n+1}(s, z) \le (k + 1) r |t| + k \int_{I(t)} D_n(u, z) \, du$$

for $s \in I(t)$. Set $A = k + 1$ and

$$A_n(t) = \sup_{u \in I(t),\ |z| \le c} D_n(u, z).$$


The function being integrated on the right hand side of (17) is $\le A_n(u)$.
Taking the sup of the left hand side for $s \in I(t)$ and $|z| \le c$, and using $k \le A$,
it can be deduced that

(15.18) $$A_{n+1}(t) \le rA|t| + A \int_{I(t)} A_n(u) \, du$$


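for $n > N$, and so, iterating,

$$\begin{aligned} A_{n+2}(t) &\le rA|t| + A \int_{I(t)} du \left[ rA|u| + A \int_{I(u)} A_n(v) \, dv \right] \\ &= rA|t| + rA^2 |t|^2/2! + A^2 \int_{I(t)} du_1 \int_{I(u_1)} A_n(u_2) \, du_2 \end{aligned}$$

and so on. As $A_n(u) \le A_n(t)$ for $u \in I(t)$, it follows that

$$\int_{I(t)} du_1 \int_{I(u_1)} du_2 \ldots \int_{I(u_{p-1})} A_n(u_p) \, du_p \le A_n(t) \int_{I(t)} du_1 \int_{I(u_1)} du_2 \ldots \int_{I(u_{p-1})} du_p = A_n(t) \, |t|^p / p!.$$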


Hence iterating (18) gives

$$\begin{aligned} A_{n+p}(t) &\le r \left( A|t| + \ldots + A^p |t|^p / p! \right) + A_n(t) A^p |t|^p / p! \\ &\le r \left[ \exp(Aa') - 1 \right] + A_n(t) (Aa')^p / p! \end{aligned}$$

for $n > N(r)$, $p \ge 1$ and $|t| \le a'$. (The reader will recognize these to be
arguments given by Liouville already outlined in Chap. VII, n° 18). On the
right hand side, the first term is arbitrarily small if $r$ is conveniently chosen,
and the second one tends to 0 as $p$ increases. Hence $\lim A_n(t) = 0$ for all $t$
and in particular for $t = \pm a'$. So $D_3 y_n(t, z)$ is uniformly convergent. Hence,
the function $y(t, z)$ is indeed $C^1$.
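
The iteration of (18) can be checked mechanically. The sketch below is only an illustration, not part of the proof: it takes (18) with equality, restricts to $t \ge 0$, and starts from a constant majorant; the values of `r`, `A`, `a_prime` and `delta0` are arbitrary assumptions, and the helper names are hypothetical. Polynomials are stored as coefficient lists, so each step $p(t) \mapsto rAt + A\int_0^t p(u)\,du$ is exact up to rounding.

```python
import math

def iterate_bound(poly, r, A):
    """One step of (15.18) taken with equality, for t >= 0:
    p(t) -> r*A*t + A * integral_0^t p(u) du.
    A polynomial is stored as its coefficient list [c0, c1, c2, ...]."""
    integral = [0.0] + [c / (k + 1) for k, c in enumerate(poly)]
    result = [A * c for c in integral]
    result[1] += r * A
    return result

def evaluate(poly, t):
    return sum(c * t ** k for k, c in enumerate(poly))

# Illustrative values only (assumptions, not taken from the text).
r, A, a_prime, delta0 = 1e-3, 2.0, 1.5, 5.0
p = 30

poly = [delta0]              # constant majorant playing the role of A_n
for _ in range(p):
    poly = iterate_bound(poly, r, A)

value = evaluate(poly, a_prime)

# Closed form obtained in the text after p iterations.
closed_form = (
    r * sum((A * a_prime) ** k / math.factorial(k) for k in range(1, p + 1))
    + delta0 * (A * a_prime) ** p / math.factorial(p)
)
# Upper bound r[exp(A a') - 1] for the first term.
limit_bound = r * (math.exp(A * a_prime) - 1.0)
```

With these values the term carrying `delta0` is negligible for `p = 30`, which is the mechanism behind $\lim A_n(t) = 0$: the first term is controlled by `r`, the second by $(Aa')^p/p!$.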

It remains to show that it is $C^k$ if $L$ is $C^k$. For $k = 2$, the right hand side
of the differential equation

$$Dy(t, z) = L[t, y(t, z), z], \qquad y(0, z) = 0$$

is $C^1$ like $L$ and $y$, so that $D^2y(t, z)$ and $D_3Dy(t, z)$ exist and are continuous.
On the other hand, $D_3y(t, z) = Y(t, z)$ satisfies (15), i.e.

(15.19) $$DY(t, z) = U(t, z)Y(t, z) + V(t, z), \qquad Y(0, z) = 0.$$

As $U(t, z)$ and $V(t, z)$ are $C^1$ by (14) if $L$ is $C^2$ and if $y$ is $C^1$, the solution
$Y(t, z)$ of (19) is $C^1$. The function $y(t, z)$ is, therefore, $C^2$. And so on, which
finishes the proof of statements (a), (b) and (c). I give up turning them into
a theorem whose statement would take half a page.


(v) Matrix exponential. As seen above, the derivative $Y(t, z) = D_3y(t, z)$
satisfies a differential equation (19) whose right hand side is an affine linear
function of $Y$. Equations of the form

$$x'(t) = A(t)x(t) + b(t), \qquad x(0) = \xi,$$

where $x(t), b(t) \in \mathbb{R}^n$ and $A(t) \in M_n(\mathbb{R})$, are dealt with by using the general
method, but there is an additional result in this case: the integrals are defined
on every interval on which $A(t)$ and $b(t)$ are defined and continuous.

A particularly simple case is that of an equation

$$x'(t) = Ax(t), \qquad x(0) = \xi$$

with a constant matrix $A$. The successive approximation method leads to
functions

$$x_1(t) = \xi + \int_0^t A\xi \, du = (1 + At)\xi,$$

$$x_2(t) = \xi + \int_0^t A(1 + Au)\xi \, du = \xi + At\xi + A^2 t^{[2]} \xi \qquad (t^{[2]} = t^2/2!),$$

etc., whence the solution

(15.20) $$x(t) = \exp(tA)\xi,$$


where for any matrix or linear operator $A$, we set

(15.21) $$\exp(A) = \sum A^{[n]} = \sum A^n/n!;$$

the series converges since $\|A^n\| \le \|A\|^n$.
The binomial formula shows that, like in dimension one,

(15.22) $$\exp(A + B) = \exp(A)\exp(B) \quad \text{if } AB = BA.$$

As $\exp(0) = 1$, the operator $\exp(A)$ is, therefore, always invertible, with
$\exp(A)^{-1} = \exp(-A)$.
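
As a numerical illustration of (15.21) and (15.22), here is a minimal sketch that sums the exponential series for a small matrix using only the Python standard library. The helper names (`mat_mul`, `mat_exp`, `scale`) and the truncation at 40 terms are choices made for this example, not anything prescribed by the text. It takes the generator of plane rotations, for which $\exp(tA)$ should be the rotation of angle $t$, and forms $\exp(A)\exp(-A)$, which should be the identity.

```python
import math

def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def mat_mul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def scale(c, a):
    return [[c * x for x in row] for row in a]

def mat_exp(a, terms=40):
    """Partial sum 1 + A + A^2/2! + ... + A^(terms-1)/(terms-1)! of (15.21)."""
    n = len(a)
    total = identity(n)
    term = identity(n)          # current term A^k / k!, starting with k = 0
    for k in range(1, terms):
        term = [[x / k for x in row] for row in mat_mul(term, a)]
        total = [[total[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    return total

# Generator of plane rotations: exp(tA) should be the rotation of angle t.
A = [[0.0, -1.0], [1.0, 0.0]]
t = 0.7
X = mat_exp(scale(t, A))
rotation = [[math.cos(t), -math.sin(t)], [math.sin(t), math.cos(t)]]

# A and -A commute, so exp(A) exp(-A) = exp(0) = 1.
product = mat_mul(mat_exp(A), mat_exp(scale(-1.0, A)))
```

The partial sums computed by `mat_exp(scale(t, A))` applied to $\xi$ are the successive approximations $x_n(t)$ of the constant-coefficient equation.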


On the other hand, if $A$ is replaced by $UAU^{-1}$, where the matrix $U$ is invertible,
each term $A^n$ of the series is replaced by $UA^nU^{-1}$, and so

(15.23) $$\exp(UAU^{-1}) = U\exp(A)U^{-1}.$$

(22) also shows that the map $t \mapsto \exp(tA) = X(t)$ is a continuous homomorphism,
and is even $C^\infty$, from the additive group $\mathbb{R}$ to the multiplicative
group $GL_n(\mathbb{R})$; it is called a one-parameter subgroup of $GL_n(\mathbb{R})$. There are
no others.

Indeed, if $X(t)$ is supposed to be differentiable at $t = 0$, formula $X(t + h) =
X(t)X(h)$ shows that, like in dimension one, $X(t)$ is differentiable everywhere
and that $X'(t) = X'(0)X(t) = AX(t)$, where $A = X'(0)$. Although $X(t)$ now
has values in $M_n(\mathbb{R})$ rather than in $\mathbb{R}^n$, $X(t) = \exp(tA)$ again holds since the
two sides satisfy the same differential equation with the same initial condition
$X(0) = 1$.

If the only assumption made is that the function $X(t)$ is continuous, it is
“regularized” by choosing a function $\varphi$ on $\mathbb{R}$ in the Schwartz space $\mathcal{D}$ and
by considering the integral

$$\int \varphi(t - u)X(u) \, du = \int \varphi(v)X(t - v) \, dv = X(t) \int \varphi(v)X(-v) \, dv.$$

The first integral, extended to $\mathbb{R}$ and in fact to a compact subset, is a $C^\infty$
function of $t$ like $\varphi$, and the third one can be made to approach $X(0) = 1$,
hence to be invertible, by using a Dirac sequence $\varphi_n$ (Chap. V, §8, n° 27).
So $X(t)$ is indeed $C^\infty$.
This remark has already been made in the classical case [Chap. VII, § 1, n° 2,
(iii)].


Finally note the useful formula

(15.24) $$\det[\exp(X)] = \exp[\mathrm{Tr}(X)],$$

where, for any square matrix $X$, $\mathrm{Tr}(X)$ denotes the sum of the diagonal elements
of $X$. Setting $D(t) = \det[\exp(tX)]$ indeed defines an obviously continuous
homomorphism from the additive group $\mathbb{R}$ to the multiplicative group $\mathbb{R}^*$,
and so $D(t) = \exp(ct)$ for some $c = D'(0)$
to be computed. As the map $\det : M_n(\mathbb{R}) \to \mathbb{R}$ is much more than differentiable,
the multivariate chain rule shows that $D'(0) = \det'(1)\exp'(0).X$, where
$\det'(1)$ is the derivative at $X = 1$ of the map $X \mapsto \det X$ and $\exp'(0) = 1$
that of the exp map at the origin. Hence $c = \det'(1)X = T(X)$, where $T$ is a
real-valued linear function of $X$. By (23),

(15.25) $$T(UXU^{-1}) = T(X) \quad \text{for all } U \in GL_n(\mathbb{R}),$$


and so $T(UX) = T(XU)$. As it is not hard to check that every $Y \in M_n(\mathbb{R})$
is the sum of invertible matrices ($Y + \lambda 1$ is invertible provided $-\lambda$ is not an
eigenvalue of $Y$), it can be deduced that

(15.26) $$T(XY - YX) = 0$$

for all matrices $X$ and $Y$. Setting $T(X) = a_i^j X_j^i$, where the $X_j^i$ are the
coefficients of $X$, and by writing (26) explicitly for a matrix $Y$ all of whose
entries, apart from one, are zero, it immediately follows that $a_i^j = 0$ for
$i \ne j$, so that $T(X)$ is a linear combination of the diagonal entries of $X$. Its
value remains invariant if its entries undergo an arbitrary permutation, for
this amounts to replacing $X$ by $UXU^{-1}$, where the matrix $U$ permutes the
vectors of the canonical basis for $\mathbb{R}^n$. Hence $T(X)$ is proportional to the trace
of $X$. It remains to check that $T(X) = \mathrm{Tr}(X)$ for a matrix $X$ with non-zero
trace; we leave it to the reader to choose this matrix in such a way as to
minimize calculations.

This proof of (24) is slightly longer than the classical proof, but it teaches
the reader, if he does not already know it, that, up to a constant factor,
relation (26) characterizes the function $X \mapsto \mathrm{Tr}(X)$.
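
Formula (24) and relation (26) are easy to test numerically. The following sketch does so for one $2 \times 2$ example; the matrices `X` and `Y`, the helper names and the 60-term truncation of the exponential series are arbitrary choices made for the illustration, not taken from the text.

```python
import math

def mat_mul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_exp(a, terms=60):
    """Truncated series sum of A^k / k! for k < terms."""
    n = len(a)
    total = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    term = [row[:] for row in total]
    for k in range(1, terms):
        term = [[x / k for x in row] for row in mat_mul(term, a)]
        total = [[total[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    return total

def trace(m):
    return sum(m[i][i] for i in range(len(m)))

def det2(m):
    """Determinant of a 2 x 2 matrix."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

X = [[0.3, 1.2], [-0.7, 0.5]]
Y = [[2.0, -1.0], [0.4, 0.9]]

lhs = det2(mat_exp(X))            # det[exp(X)]
rhs = math.exp(trace(X))          # exp[Tr(X)]

# Relation (15.26) for T = Tr: the trace of a commutator vanishes.
XY, YX = mat_mul(X, Y), mat_mul(Y, X)
commutator_trace = trace(XY) - trace(YX)
```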

Exercise 1. Show that, for every linear functional $X \mapsto f(X)$ on $M_n(R)$,
where $R$ denotes an arbitrary commutative field, there is a unique matrix $A$
such that $f(X) = \mathrm{Tr}(AX)$.

Exercise 2. Let $E$ be an $n$-dimensional vector space over $\mathbb{R}$, $M(E)$ the
set of linear maps $E \to E$, $(a_i)$ a basis for $E$ and $f(x_1, \ldots, x_n)$ the unique
$n$-linear alternating form equal to 1 on the basis vectors, i.e. the determinant
of the $x_i$ with respect to this basis. Hence if $u \in M(E)$, then

$$\det(u) = f[u(a_1), \ldots, u(a_n)] = f(u_1, \ldots, u_n),$$

where $u_i = u(a_i)$. Show that the derivative

$$\det{}'(u) : h \mapsto \frac{d}{dt} \det(u + th) \quad \text{for } t = 0,$$

which maps $M(E)$ linearly to $\mathbb{R}$, is given by

$$h \mapsto f(h_1, u_2, \ldots, u_n) + \ldots + f(u_1, \ldots, u_{n-1}, h_n),$$

where $h_i = h(a_i)$. Deduce that, for $u$ invertible,

(15.27) $$\det{}'(u)h = \det(u)\,\mathrm{Tr}(u^{-1}h).$$


Exercise 3. Show that the group $SL_n(\mathbb{R})$ of $X \in M_n(\mathbb{R})$ such that
$\det(X) = 1$ is a closed submanifold of $M_n(\mathbb{R})$.

The § where we will discuss Lie groups in vol. IV will provide an opportunity
to apply the results of this n° in a particularly important case.


16 — Differential Forms on a Manifold 


The general definition of tensor fields makes the notion of a differential form
of degree $p$ on a manifold $X$ obvious: such a form associates to each point $x$
of the open subset of $X$ a $p$-linear alternating form $\omega(x; h_1, \ldots, h_p)$ on $X'(x)$;
hence, in every local chart $(U, \varphi)$ at $x$, for $p = 3$ for example,

(16.1) $$\omega(x; h, k, l) = a_{ijk}(\xi) h^i(\varphi) k^j(\varphi) l^k(\varphi)$$

with antisymmetric coefficients depending on the chosen chart. The right
hand side of (2) is a differential form $\omega(\varphi)$ on $\varphi(U)$ and if $(U, \varphi)$ is replaced
by $(V, \psi)$, then $\omega(\varphi)$ is the inverse image of $\omega(\psi)$ under the chart change
diffeomorphism $\varphi(x) \mapsto \psi(x)$; the $a_{ijk}(\xi)$ are the components of
a tensor. Conversely, considering in every chart a form $\omega(\varphi)$ which, under
chart change, is transformed as above, defines a form $\omega$ on $X$.

The definition of the exterior product of two forms generalizes in an obvious
way. This makes it possible to write

(16.2) $$\omega = \sum_{i<j<k} a_{ijk} \, d\xi^i \wedge d\xi^j \wedge d\xi^k = \frac{1}{3!} a_{ijk} \, d\xi^i \wedge d\xi^j \wedge d\xi^k$$

as in $\mathbb{R}^n$.

The notion of the inverse image of a form under a map $f : X \to Y$ also
easily generalizes: if $\omega$ is a form of degree 3 for example in $Y$, the form
$\varpi = \omega \circ f$ on $X$ is defined by

(16.3) $$\varpi(x; h_1, h_2, h_3) = \omega\left(f(x); f'(x)h_1, f'(x)h_2, f'(x)h_3\right).$$

This is the only formula likely to still be well-defined here.


To define the exterior derivative $d\omega$ of a given form $\omega$ on $X$, the forms
$\omega(\varphi)$ expressing $\omega$ in the charts of $X$ may be used. As, in Cartesian space, the
operation “inverse image” transforms an exterior derivative into an exterior
derivative, the $d\omega(\varphi)$ clearly define a form on $X$, namely the derivative $d\omega$
we were looking for. If for example

(16.4) $$\omega(\varphi) = a_{jk} \, d\xi^j \wedge d\xi^k$$

in $(U, \varphi)$, then

(16.5) $$d\omega(\varphi) = da_{jk} \wedge d\xi^j \wedge d\xi^k = \frac{1}{3} \left( D_i a_{jk} + D_j a_{ki} + D_k a_{ij} \right) d\xi^i \wedge d\xi^j \wedge d\xi^k.$$


To define $d\omega$ as we did in a Cartesian space, the covariant derivative $\omega'$
of $\omega$ should first be defined, and more generally that of a tensor field $T$ on
$X$. But the definition

(16.6) $$T'(x; h, k, u) = \frac{d}{ds} T(x + sh; k, u) \quad \text{for } s = 0$$

of $T'$ is not well-defined in a manifold, and defining $T'$ by differentiating the
components of $T$ in each local chart would involve second derivatives. Formula
(6) interpreted as we have done will, therefore, not lead to the components
of a new tensor. The solution of the problem can be found in the theory
of “connections”, which will not be presented here. Let us only observe
that when we investigate the manner in which the partial derivatives of the
coefficients of a differential form are transformed under chart change, the
second derivatives, which for an arbitrary tensor field occur in the formulas,
disappear: a miracle of the antisymmetric character of the coefficients, as the
reader can check with some patience.


17 — Integration of Differentiable Forms 


All that has been done at the start of § 2 on “curvilinear” integrals over open
subsets of a Cartesian space generalizes immediately to differential forms of
degree 1 on a manifold $X$. If $\omega$ is such a form and $\gamma : I \to X$ is a path of
class $C^1$ in $X$, for simplicity’s sake, the integral of $\omega$ along $\gamma$ is obtained by
replacing $\omega$ with its inverse image $\omega \circ \gamma$ under $\gamma$ and by integrating the result
over $[0, 1]$. The extended integral of a form of degree 2 over a 2-dimensional
path $\sigma : I \times I = K \to X$ is defined likewise. Then for $\omega$ of degree 1,

$$\int_\sigma d\omega = \int_{\partial\sigma} \omega$$

as in (9.20). The invariance of the integrals of $\omega$ under homotopy now follows
if $\omega$ is closed. Notice that $\omega$ is an exact differential if and only if its integral
along every closed path is zero. Etc.

A much more serious problem consists in defining the extended integral
of a form of maximum degree on X without taking a “parametric representation”
of X for granted, for otherwise we would only need to integrate over
a cube.


(i) Orientable manifolds. Let $X$ be an $n$-dimensional manifold and $\omega$ a
differential form of degree $n$ on $X$ and of class $C^0$. Whatever be the final
definition of the integral of $\omega$, if $X$ is not compact, we will clearly come
across convergence problems at infinity unrelated to the problem at hand.
This is already obvious if $X = \mathbb{R}$. It is, therefore, prudent to suppose that $\omega$
has compact support⁶⁶ in order to get rid of them, as we did at the start of
Chapter V.

The simplest case is obtained by supposing the existence of a chart for all
of $X$, in other words, that $X$ is diffeomorphic to an open subset of a Cartesian
space. For this, let us choose a diffeomorphism $\varphi$ from $X$ onto $\varphi(X) \subset \mathbb{R}^n$;
it transforms $\omega$ into a form

$$\omega(\varphi) = a(\xi) \, d\xi^1 \wedge \ldots \wedge d\xi^n$$

on $\varphi(X)$ and we wish to set

(17.1) $$\int_X \omega = \int_{\varphi(X)} a(\xi) \, d\xi^1 \ldots d\xi^n.$$

This is an ordinary multiple integral over the open set $\varphi(X)$ of a function
which vanishes outside a compact subset of it.

If $\psi$ is another diffeomorphism from $X$ onto an open subset of $\mathbb{R}^n$, then
there is a form

$$\omega(\psi) = b(\eta) \, d\eta^1 \wedge \ldots \wedge d\eta^n$$

on $\psi(X)$. As $\omega(\varphi)$ is the inverse image of $\omega(\psi)$ under the chart change
diffeomorphism $\theta : \varphi(X) \to \psi(X)$,

$$a(\xi) = b[\theta(\xi)] \, J_\theta(\xi).$$

Since the change of variable formula for multiple integrals shows that

$$\int_{\psi(X)} b(\eta) \, d\eta^1 \ldots d\eta^n = \int_{\varphi(X)} b[\theta(\xi)] \, |J_\theta(\xi)| \, d\xi^1 \ldots d\xi^n,$$

we are led to conclude that the two values proposed for the integral of $\omega$ are
equal only if $J_\theta(\xi) > 0$ everywhere. This is not a good sign in every sense of
the word since we want a result independent of the coordinate system used.
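
The sign problem already shows up in dimension one. The sketch below is an illustration under assumptions chosen for the example: the chart images are both the interval $(0, 1)$, the chart change is the orientation-reversing map $\theta(\xi) = 1 - \xi$ with $J_\theta = -1$, and $b(\eta) = \eta^2(1 - \eta)$, which vanishes at the endpoints but does not have compact support in the open interval. The integrals are approximated by the midpoint rule.

```python
def midpoint_integral(f, lo, hi, n=2000):
    """Midpoint-rule approximation of the integral of f over [lo, hi]."""
    h = (hi - lo) / n
    return h * sum(f(lo + (k + 0.5) * h) for k in range(n))

def b(eta):
    """Coefficient of the form in the second chart: omega(psi) = b(eta) d(eta)."""
    return eta ** 2 * (1.0 - eta)

def theta(xi):
    """Orientation-reversing chart change of (0, 1) onto itself."""
    return 1.0 - xi

J_THETA = -1.0   # derivative (Jacobian) of theta, constant here

def a(xi):
    """Coefficient in the first chart: a(xi) = b[theta(xi)] * J_theta(xi)."""
    return b(theta(xi)) * J_THETA

integral_psi = midpoint_integral(b, 0.0, 1.0)   # value proposed by the chart psi
integral_phi = midpoint_integral(a, 0.0, 1.0)   # value proposed by the chart phi
```

The two proposed values are opposite (about $1/12$ and $-1/12$), exactly as the factor $|J_\theta|$ versus $J_\theta$ predicts.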


66 Recall that it is the largest closed set Supp(...) outside which the function (or
differential form, or...) is zero.

The same difficulty arises in the case of a general manifold. A simple case
is that of a form $\omega$ whose compact support is contained in the domain of a
chart $(U, \varphi)$; it would be natural to set

$$\int_X \omega = \int_U \omega,$$

where the right hand side is defined by formula (1) applied to $U$. If the chart
$(U, \varphi)$ is replaced by the chart $(V, \psi)$ such that $\mathrm{Supp}(\omega) \subset V$, then, applying
(1) to $U$ or $V$, it clearly suffices to integrate over $U \cap V$; the arguments used
above then show that, here too, both definitions of the integral coincide only
if the Jacobian of change of coordinates $\varphi(x) \mapsto \psi(x)$ is everywhere $> 0$ in
$U \cap V$.

To overcome this difficulty, we are led to allow only changes of local charts 
whose Jacobian is everywhere > 0, more precisely, instead of all possible 
charts, to use only an atlas all of whose changes of charts have positive 
Jacobian. The existence of such an atlas — recall that the charts in an atlas 
must cover the manifold — is not obvious and may be false for some manifolds. 
When such an atlas exists, the manifold is said to be orientable. 

The relation with the classical notion of orientable surface in $\mathbb{R}^3$ recalled
in n° 9, (iv) is easy to see. Indeed, let $X$ be a 2-dimensional submanifold of
$\mathbb{R}^3$ and $(U, \varphi)$ a local chart for $X$; if $f$ is the inverse map of $\varphi$, in the open
subset $U$, the surface $X$ is given by the parametric representation

$$x = f^1(s, t), \quad y = f^2(s, t), \quad z = f^3(s, t),$$

where $(s, t)$ vary in the open subset $\varphi(U)$ of the plane. Then the derivatives
$D_1f$ and $D_2f$ of $f$ with respect to $s$ and $t$ at each point $x \in U$ are two
non-proportional tangent vectors to $X$ at $x$ (in the classical sense); their
classical vector product is a normal vector to $X$ at $x$, which is a continuous
function of the point $x$ and thus coherently orients the normals to $X$ at
points of $U$. Similarly, if $(V, \psi)$ is another chart and if $g = \psi^{-1}$, the product
$D_1g \wedge D_2g$ leads to an orientation of normals in $V$. In $U \cap V$, the $D_ig$ are linear
combinations of the $D_jf$ whose coefficients are the entries of the Jacobian
matrix of change of coordinate $\theta : \varphi(x) \to \psi(x)$. As the vector product of
two vectors is an alternating bilinear form of these,

$$D_1g(\eta) \wedge D_2g(\eta) = J_\theta(\xi) \, D_1f(\xi) \wedge D_2f(\xi),$$


which shows that, in $U \cap V$, the orientations of normals to $X$ defined by
the charts $(U, \varphi)$ and $(V, \psi)$ are identical only if $J_\theta > 0$ everywhere. So if $X$
can be covered by charts such that the chart change formulas have positive
Jacobians, then the normals at all points of $X$ can be coherently oriented,
which is the classical definition of an orientable surface.

Conversely, suppose that this condition holds; among all charts of $X$,
only consider those leading to the chosen coherent orientation of normals;⁶⁷
the previous arguments then show that the relation $J_\theta > 0$ holds for any
two of these charts, and the general definition of orientability given above is
recovered.

In the general case, which we now return to, consider two atlases $(U_i, \varphi_i)$
and $(V_p, \psi_p)$ with positive Jacobians. For $x \in X$, choose indices $i$ and $p$
such that $x \in U_i \cap V_p$; let $\varepsilon(x) = \pm 1$ be the sign at the point $x$ of the
Jacobian of the chart change $\varphi_i(x) \mapsto \psi_p(x)$. It only depends on $x$, since if
$x \in U_j \cap V_q$ for another couple of indices, the Jacobians of the chart changes
$(U_i, \varphi_i) \to (U_j, \varphi_j)$ and $(V_p, \psi_p) \to (V_q, \psi_q)$ are positive by assumption.
The result is then an immediate consequence of the multiplication formula
for Jacobians. Having said this, observe that as the Jacobian of every chart
change $(U_i, \varphi_i) \to (V_p, \psi_p)$ is a continuous function on the open subset
$U_i \cap V_p$, its sign remains constant in the neighbourhood of every $x \in U_i \cap V_p$;
hence $\varepsilon(x)$ is a continuous function on $X$. The two atlases considered will be
said to define the same orientation (resp. opposite orientations) if $\varepsilon(x) = +1$
(resp. $-1$) for all $x \in X$. If $X$ is connected, there is clearly no other possibility;
in other words, there are at most two ways of orienting a connected manifold.

Whether X is connected or not, an equivalence relation can be defined on 
the set of all charts for X by setting two charts to be equivalent if and only 
if the Jacobian of chart change is everywhere positive. This makes it possible 
to divide the charts into equivalence classes, each of these classes being an 
atlas of X with the following “maximality” property: given an atlas, if the 
Jacobians of change of coordinates between a chart for X and those in the 
atlas are all positive, then the chart belongs to it. Orienting a manifold then 
consists in choosing one of these classes; the charts which belong to it will 
then be said to be compatible with the orientation of X. 


The simplest manifolds being Cartesian spaces, the orientation problem 
already arises for them and, in the same way, for their open subsets; in this 
case, there is a purely algebraic definition of orientation. 

Indeed, let $(a_i)$ be a basis for the Cartesian space $E$; this immediately
gives a global chart for $E$ (or for an open subset of $E$) by associating to each
$x \in E$ its coordinates with respect to this basis; it could be called the linear
chart for $E$ associated to the chosen basis. This chart being by itself an
atlas of $E$, the signs of all the Jacobians of its changes of charts are easy
to compute... Hence, $E$ is orientable, and if this chart is used to orient $E$,
then the choice of a basis for $E$ defines an orientation of $E$. Now let $(b_i)$ be
another basis for $E$; there is another linear chart for $E$ associated to it and
the latter can be obtained from the former by formulas known by everyone.
Hence the orientations defined by the two bases considered are identical or
opposite according to whether the determinant of the change of basis matrix
is positive or negative. From this point of view, two bases can be considered
to be equivalent if the determinant of the matrix taking one to the other is
$> 0$. The set of all bases for $E$ is thereby divided into two equivalence classes,
neither of which plays a distinguished role; orienting $E$ then amounts to choosing
one of these classes. The bases belonging to it are said to be direct, by
analogy with physicists’ “direct trihedrons”. There is a distinguished class of
bases in the spaces $\mathbb{R}^n$: that of the canonical basis. Hence it is possible to
choose a canonical orientation in $\mathbb{R}^n$, but it is impossible to do so in any
other Cartesian space.

67 If a chart $(U, \varphi)$ does not satisfy this condition, it suffices to compose it with the
diffeomorphism $(s, t) \mapsto (t, s)$ to obtain in the same open subset of $X$ a chart
compatible with the orientation of the normals.
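
The algebraic criterion can be made concrete in $\mathbb{R}^n$. In the sketch below (an illustration with hypothetical helper names), each basis is given by the coordinates of its vectors in the canonical basis; if $P$ and $Q$ are the corresponding matrices, the change of basis matrix is $P^{-1}Q$, whose determinant has the sign of $\det(P)\det(Q)$, so no matrix inversion is needed.

```python
def det(m):
    """Determinant by Laplace expansion along the first row (fine for small n)."""
    n = len(m)
    if n == 1:
        return m[0][0]
    total = 0.0
    for j in range(n):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * m[0][j] * det(minor)
    return total

def same_orientation(basis_p, basis_q):
    """True if the two bases of R^n belong to the same equivalence class.

    Each basis is a list of vectors written in the canonical basis. The sign
    of det(P^{-1} Q) equals the sign of det(P) * det(Q)."""
    return det(basis_p) * det(basis_q) > 0

canonical = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
swapped = [[0, 1, 0], [1, 0, 0], [0, 0, 1]]     # e2, e1, e3: a transposition
cyclic = [[0, 1, 0], [0, 0, 1], [1, 0, 0]]      # e2, e3, e1: a 3-cycle

swapped_is_direct = same_orientation(canonical, swapped)
cyclic_is_direct = same_orientation(canonical, cyclic)
```

Exchanging two vectors of a basis reverses the orientation, while a cyclic permutation of three vectors preserves it.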

The case of a general manifold $X$ can be presented in the same manner.
Let $(U, \varphi)$ be a chart for $X$. As seen in (i) of n° 13, at each point $x$ of $U$, it
gives rise to a basis $(a_i(\xi))$ for $X'(x)$ such that

$$h = h^i(\xi) a_i(\xi) \quad \text{for all } h \in X'(x).$$

If $(V, \psi)$ is another chart, then $h = h^\alpha(\eta) b_\alpha(\eta)$ also holds in the basis corresponding
to this chart. But as the formula $h^i(\xi) = \rho^i_\alpha(\eta) h^\alpha(\eta)$, whose coefficients
are the $\partial\xi^i/\partial\eta^\alpha$, relates the $h^i(\xi)$ to the $h^\alpha(\eta)$, the determinant of the
matrix taking the first basis to the second one is clearly the Jacobian of the
chart change at $x$ (or of its inverse, which does not change the sign). Hence,
if in every local chart $(U, \varphi)$, the basis $(a_i(\xi))$ is used to orient the tangent
space $X'(x)$ at each $x \in U$ as done above to orient a Cartesian space, two
such charts are seen to define the same orientation in $U \cap V$ if and only if
they orient $X'(x)$ in the same manner for all $x \in U \cap V$.

The conclusion is clear: orienting a manifold $X$ amounts to orienting each
tangent space $X'(x)$ so that $X$ can be covered by charts $(U, \varphi)$ satisfying the
following condition: for all $x \in U$, the orientation of $X'(x)$ is defined by the
basis $(a_i(\xi))$ for $X'(x)$ associated to $(U, \varphi)$.


(ii) Integration of differential forms. These arguments show that in order
to define the integral of a differential form $\omega$ of maximum degree on a manifold
$X$, $X$ needs to be assumed to be oriented and $\omega$ to have compact support.
For lack of better, the arguments of (i) then show that the integral of $\omega$ can
be given an absolute meaning in the particular case when the support of $\omega$ is
contained in an open Cartesian subset $U$ of $X$, i.e. one which is the domain of a
chart: apply formula (1) after having chosen
a chart $(U, \varphi)$ compatible with the orientation of $X$; the result is the same
for all open Cartesian subsets $U$ such that $\mathrm{Supp}(\omega) \subset U$ and for all possible
diffeomorphisms $\varphi$.

However, the method supposes that the compact subset $K = \mathrm{Supp}(\omega)$
is contained in an open Cartesian set. In the general case, even that of a
sphere in $\mathbb{R}^3$, only the existence of a covering of $K$ by a finite number of such
open subsets $U_i$ can be guaranteed using BL. The method then consists in
constructing forms $\omega_i$ with compact support satisfying

(17.2) $$\mathrm{Supp}(\omega_i) \subset U_i \quad \& \quad \omega = \sum \omega_i$$

and in setting by definition,

(17.3) $$\int_X \omega = \sum \int_{U_i} \omega_i$$

since, in any reasonable interpretation of the integral of $\omega_i$, it suffices to
integrate over the open subset $U_i$ outside which $\omega_i$ vanishes. But to give an
“absolute” meaning to the left hand side of (3), the right hand side still
needs to be shown to remain invariant if the $U_i$ are replaced by open
Cartesian subsets $V_p$ covering $K$ and the $\omega_i$ by forms $\omega_p$ satisfying (2) for the
new covering, which is not at all obvious. Partitions of unity need to be used
for this, a technique that can prove useful elsewhere.


Lemma 1. Let $A$ and $B$ be two disjoint closed subsets in a metric space $X$.
There is a function $f$ defined and continuous on $X$ satisfying $f(x) = 1$ on
$A$, $f(x) = 2$ on $B$ and $1 \le f(x) \le 2$ everywhere.


The lemma is trivial if $A$ is empty (take $f = 2$ everywhere) or if $B$ is
empty (take $f = 1$ everywhere). Otherwise, choose a distance function $d(x, y)$
defining the topology on $X$ and set

$$f(x) = \inf[d(x, A), 2d(x, B)] \, / \, \inf[d(x, A), d(x, B)]$$

for $x \in X - (A \cup B)$, an open set on which $f$ is continuous since so are the
functions $d(x, A)$ and $d(x, B)$ that do not vanish there. In the neighbourhood
of every point of $A$, $d(x, A) < d(x, B)$ and so $f(x) = 1$; in the neighbourhood
of every point of $B$, $2d(x, B) < d(x, A)$ and so $f(x) = 2$. Setting $f(x) = 1$ on
$A$ and $f(x) = 2$ on $B$, we find the function sought⁶⁸ in all of the space $X$.
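
The formula of the proof can be tried out on the real line. In this sketch the sets $A = (-\infty, 0]$ and $B = [1, +\infty)$, as well as the function names, are assumptions made for the example; the distances to $A$ and $B$ are then $\max(x, 0)$ and $\max(1 - x, 0)$.

```python
def lemma1_function(x, dist_a, dist_b):
    """f(x) = inf[d(x,A), 2 d(x,B)] / inf[d(x,A), d(x,B)] off A and B,
    extended by 1 on A and by 2 on B, as in the proof of Lemma 1."""
    da, db = dist_a(x), dist_b(x)
    if da == 0.0:
        return 1.0
    if db == 0.0:
        return 2.0
    return min(da, 2.0 * db) / min(da, db)

def dist_a(x):
    """Distance from x to A = (-inf, 0]."""
    return max(x, 0.0)

def dist_b(x):
    """Distance from x to B = [1, +inf)."""
    return max(1.0 - x, 0.0)

def f(x):
    return lemma1_function(x, dist_a, dist_b)

grid = [k / 100.0 for k in range(-100, 201)]
values = [f(x) for x in grid]
```

Here `f` equals 1 up to $x = 1/2$, rises to 2 between $1/2$ and $2/3$, and equals 2 afterwards, so it is constant near $A$ and near $B$ as the proof asserts.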


Lemma 2. Let $U$ be an open set and $A$ a closed one contained in $U$. There
is an open set $V$ such that $A \subset V \subset \bar{V} \subset U$. If $X$ is locally compact⁶⁹ and $A$
is compact, $\bar{V}$ may be assumed to be compact.


Here too, the first statement is trivial if $U = X$ (take $V = X$) or if $A$
is empty (take $V = \emptyset$). Otherwise, set $B = X - U$, choose a
function $f$ by lemma 1 and take $V = \{f(x) < 3/2\}$, an open set containing
$A$ trivially and whose closure, contained in the closed set $\{f(x) \le 3/2\}$, does
not meet $B = X - U$; so $\bar{V} \subset U$.


If $X$ is locally compact and $A$ is compact, for every $x \in A$ choose an open
neighbourhood $V(x)$ of $x$, with compact closure $\bar{V}(x) \subset U$; by BL, $A$ can be
covered by finitely many $V(x_i)$; their union $V$ is the answer to the question
since $\bar{V} = \bigcup \bar{V}(x_i)$ is compact and contained in $U$.

68 A more general result (Urysohn’s theorem): if $f$ is a real, bounded continuous
function defined on a closed set $F \subset X$, there is a continuous extension of $f$ to
$X$. Assuming $f(F) \subset [1, 2]$, the case it reduces to, formula

$$f(x) = d(x, F)^{-1} \cdot \inf[f(u) d(x, u)] \quad \text{for } x \in X - F,$$

where inf relates to the $u \in F$, provides a solution. Dieudonné, IV, 5. The lemma
corresponds to the case $F = A \cup B$.

69 i.e. such that every $x \in X$ has a compact neighbourhood $V$. Then any neighbourhood
$W$ of $x$ contains a compact neighbourhood, for example $V \cap B$, where $B$
is a closed ball centered at $x$ contained in $W$.


Lemma 3. Let $U_0, \ldots, U_n$ be open sets and $X$ their union. There exist open
sets $V_0, \ldots, V_n$ whose union is $X$ and such that $\bar{V}_i \subset U_i$ for all $i$.


The $V_p$ are constructed by induction on $p$ by requiring them to also satisfy

$$V_0 \cup \ldots \cup V_p \cup U_{p+1} \cup \ldots \cup U_n = X.$$

As $V_0$ only needs to satisfy

$$X - (U_1 \cup \ldots \cup U_n) \subset V_0 \subset \bar{V}_0 \subset U_0,$$

it is obtained by applying lemma 2 to $A = X - (U_1 \cup \ldots \cup U_n)$ and $U = U_0$.
Now if $V_0, \ldots, V_{p-1}$ are constructed so that

$$X = V_0 \cup \ldots \cup V_{p-1} \cup U_p \cup \ldots \cup U_n,$$

then $V_p$ is obtained by arguing in the same way about this new covering.


Lemma 4. Let $X$ be a locally compact metric space, $A$ a compact subset
of $X$ and $(U_i)_{1 \le i \le n}$ a finite open covering of $A$. Then, there exist positive
valued continuous functions $f_i$ on $X$, with compact support and such that

(17.4) $$\mathrm{Supp}(f_i) \subset U_i, \qquad \sum f_i = 1 \text{ on } A.$$


Set $U_0 = X - A$. Lemma 3 gives open sets $V_i$ ($0 \le i \le n$) covering $X$
and such that $V_i \subset \bar{V}_i \subset U_i$. For $0 \le i \le n$, lemma 1 proves the existence of
continuous functions $g_i$ on $X$, with values in $[0, 1]$, and such that

(17.5) $$g_i(x) = 1 \text{ if } x \in \bar{V}_i, \qquad g_i(x) = 0 \text{ if } x \in X - U_i.$$

Consider the function $g = \sum g_i$. It has $\ge 0$ values and is even $> 0$ on the $V_i$,
hence on their union $X$. The functions $h_i = g_i/g$ ($0 \le i \le n$) are, therefore,
defined and continuous on $X$ and satisfy $\sum h_i(x) = 1$ for all $x \in X$; but as
$h_0 = g_0/g$ vanishes on $X - U_0 = A$, the functions $h_i$ ($1 \le i \le n$) have sum 1
on $A$, each vanishing outside the corresponding open set $U_i$. The $h_i$ still need
to be transformed into functions with compact support. But as $A$ is compact,
lemma 2 shows the existence of an open set $W$ with compact closure such
that

$$A \subset W \subset \bar{W} \subset U_1 \cup \ldots \cup U_n$$

and lemma 1 that of a function $p$ equal to 1 on $A$ and vanishing outside $W$.
Multiplying the $h_i$ by $p$ gives functions $f_i$ vanishing outside the $U_i$ and whose
supports, contained in $\bar{W}$, are compact; their sum on $A$ is obviously again
equal to 1, qed.
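
The construction can be followed step by step on the real line. The sketch below is an illustration with data chosen for the example, none of it from the text: $A = [0, 3]$, $U_1 = (-1, 2)$, $U_2 = (1, 4)$, hence $U_0 = X - A$; the sets $V_0 = (-\infty, -0.3) \cup (3.3, +\infty)$, $V_1 = (-0.5, 1.7)$, $V_2 = (1.3, 3.5)$ and $W = (-0.8, 3.8)$ play the roles they have in the proof, and the functions $g_i$ and $p$ are taken piecewise linear instead of being produced by lemma 1.

```python
def plateau(x, a, b, c, d):
    """Continuous, piecewise linear: 0 outside (a, d), 1 on [b, c]."""
    if x <= a or x >= d:
        return 0.0
    if x < b:
        return (x - a) / (b - a)
    if x <= c:
        return 1.0
    return (d - x) / (d - c)

def g0(x):
    """Equal to 1 on the closure of V0, to 0 on A = X - U0."""
    left = min(1.0, max(0.0, -x / 0.3))
    right = min(1.0, max(0.0, (x - 3.0) / 0.3))
    return max(left, right)

def g1(x):
    return plateau(x, -1.0, -0.5, 1.7, 2.0)   # 1 on closure of V1, 0 outside U1

def g2(x):
    return plateau(x, 1.0, 1.3, 3.5, 4.0)     # 1 on closure of V2, 0 outside U2

def cutoff(x):
    return plateau(x, -0.8, 0.0, 3.0, 3.8)    # p: 1 on A, 0 outside W

def partition(x):
    """Return (f1(x), f2(x)) with f_i = p * g_i / (g0 + g1 + g2)."""
    g = g0(x) + g1(x) + g2(x)                 # > 0 since the V_i cover the line
    p = cutoff(x)
    return p * g1(x) / g, p * g2(x) / g

grid_on_a = [3.0 * k / 300 for k in range(301)]        # points of A = [0, 3]
sums_on_a = [sum(partition(x)) for x in grid_on_a]
```

On $A$ the function $g_0$ vanishes and $p = 1$, so the two functions returned by `partition` add up to 1 there, while each vanishes outside its own $U_i$.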

Thanks to Lemma 4, it is possible to give an unambiguous definition of
the integral over an oriented manifold $X$ of a differential form $\omega$ of class
$C^0$, maximum degree $n = \dim(X)$ and with compact support. For this, in
conformity with lemma 4, choose open Cartesian sets $U_i$ covering the support
of $\omega$, as well as functions $f_i$ and apply formula (3) choosing $\omega_i = f_i\omega$, a form
whose support is a compact subset of $U_i$. If the $U_i$ and the $f_i$ are replaced by
the $V_p$ and the $g_p$ satisfying the same condition, then obviously $\omega_i = \sum_p g_p\omega_i$.
As the support of $\omega_i$ is a compact subset of $U_i$, the integral of $\omega_i$ over $U_i$ is the
sum of the integrals of the $g_p\omega_i$ since, as seen at the start of (ii), the situation
in an open Cartesian set is similar to that in an open subset of $\mathbb{R}^n$. But
the support of $g_p\omega_i$ being contained in the open Cartesian set $U_i \cap V_p$, its
integral over $U_i$, defined as in (i), is equal to its integral over $U_i \cap V_p$. Hence
finally,

$$\sum_i \int_{U_i} f_i\omega = \sum_{i,p} \int_{U_i \cap V_p} f_i g_p \omega.$$

Now, permuting the roles played by the $U_i$ and the $V_p$, similarly

$$\sum_p \int_{V_p} g_p\omega = \sum_{i,p} \int_{U_i \cap V_p} f_i g_p \omega$$

follows. So the two partitions of unity used to compute the integral of $\omega$
indeed lead to the same result.
At the same time, if $\omega$ and $\varpi$ are forms with compact support, then

(17.6) $$\int_X \omega + \int_X \varpi = \int_X (\omega + \varpi):$$

the union of the supports of $\omega$ and $\varpi$ being compact, the same partition of
unity can be used to compute the three integrals in question; this reduces to
the trivial case of an open subset of a Cartesian space.

Finally note that if $\omega$ is a form of degree $p < n$ and if $Y$ is a $p$-dimensional
submanifold of $X$, by inverse image, the immersion $Y \to X$ leads to a form
of maximum degree on $Y$. If the support of $\omega$ is compact and if $Y$ is closed,
then $\int_Y \omega$ can be defined.


18 — Stokes’ Formula 


It states that if $\omega$ is a differential form of degree $n - 1$ on an $n$-dimensional
oriented manifold $X$ and if $\Omega$ is an open set with compact closure whose
border $\partial\Omega$ is an $(n - 1)$-dimensional submanifold of $X$ in the sense defined at
the end of n° 12, then

(18.1) $$\int_\Omega d\omega = \int_{\partial\Omega} \omega,$$

obviously provided the border $\partial\Omega$ is oriented properly and a further condition
is imposed; its intuitive meaning is that in the neighbourhood of any of its
border points, the open set $\Omega$ is located “on one side” of it. Corollary: if $X$
is an $n$-dimensional compact manifold, then

$$\int_X d\omega = 0$$

for any form $\omega$ of degree $n - 1$ on $X$, since $\partial X$ is empty.

Before moving on to the proof, we make some remarks about what happens
in the neighbourhood of a point $x_0 \in \partial\Omega$.

Since, by assumption, $\partial\Omega$ is an $(n - 1)$-dimensional submanifold of $X$, as
seen at the end of n° 12, there is a cubic chart⁷⁰ $(U, \varphi)$ at $x_0$ such that
$\varphi(x_0) = 0$ and $\varphi(U \cap \partial\Omega)$ is the subset of $K^n$ defined by relation $\xi^1 = 0$. Let
$U_0$, $U_+$ and $U_-$ be the subsets of $U$ defined by the conditions $\xi^1 = 0$, $\xi^1 > 0$
and $\xi^1 < 0$, respectively. So $U_0 = U \cap \partial\Omega$. Since $\Omega$ does not meet its border,
$U \cap \Omega \subset U_+ \cup U_-$ and, for the same reason,

$$U_+ \cap \Omega = U_+ \cap (\Omega \cup \partial\Omega).$$

The intersections of $\Omega$ with $U_+$ and $U_-$ are, therefore, both closed and open in
these connected open sets, and so only two cases are possible: (a) $U \cap \Omega = U_+$
or $U_-$, (b) $U \cap \Omega = U_+ \cup U_-$.


As will be seen, Stokes’ formula supposes that case (a) holds everywhere.
However, if one of these two cases holds at $x_0 \in \partial\Omega$, then it also
holds at $x \in \partial\Omega$ sufficiently near $x_0$. The set $S$ of $x \in \partial\Omega$ where case (b)
holds is, therefore, an open subset of $\partial\Omega$. Similarly for the set $S'$ of points
where case (a) holds. This gives a partition of $\partial\Omega$ into two open sets, i.e.
into two closed ones. Hence, like $\partial\Omega$, $S$ and $S'$ are $(n-1)$-dimensional
compact submanifolds of $X$. Besides, $S \cup \Omega$ is clearly a connected open
subset of $X$ if $\Omega$ is connected. Replacing $X$ by $S \cup \Omega$, the following question
arises: In an $n$-dimensional connected manifold, can the complement
$\Omega$ of an $(n-1)$-dimensional compact submanifold $S$ be connected? If the
answer were always no, $S = \emptyset$ would hold and the “good” case (a) would
also hold. If $X = \mathbf{R}^2$, then $S$ is a smooth curve without multiple points.
Hence, it may be assumed to be the finite union of pairwise disjoint simple
closed curves, namely its connected components; the answer to the
question is, therefore, no by Jordan’s theorem, which we alluded to without
proof at the end of Chap. IV, § 4; the same result holds for a sphere.
But if we remove from the surface of a two-dimensional torus a circle whose
plane contains the rotation axis of the torus, or is orthogonal to it, then
case (b) holds: the complement of such a circle is connected and located
on “both sides” of it. To have $S = \emptyset$, two circles would need to be
removed from the torus; the open complement then has two connected
components, and case (a) holds for each of them. Algebraic topology (duality
theory) has long resolved and generalized this problem: the relation
between the topology of a submanifold and that of its complement.


⁷⁰ i.e. such that $\varphi(U)$ is the cube $K^n : |\xi^i| < 1$ of $\mathbf{R}^n$.


In what follows, case (a) will be assumed to always hold. $\partial\Omega$ is then said
to be the boundary of $\Omega$, a more restrictive expression than “border”, and
$\Omega$ is said to be an open set with boundary in $X$. It may then be assumed that $\xi^1 > 0$
on $U \cap \Omega$, if need be by replacing the function $\xi^1$ by $-\xi^1$ and by carrying
out an odd permutation on the other coordinates, operations that leave the
orientation of the chart considered and its cubic character unchanged. Then
$U \cap \Omega = U_+$.

Having said that, the compact set $\Omega \cup \partial\Omega$ can be covered by a finite
number of cubic charts $(U, \varphi)$ compatible with the orientation of $X$ and such
that $U \cap \Omega = U$ or $U_+$. Let $G$ be the union of these charts; it is open in $X$. By
lemma 4 above, in $G$, $\omega$ can be decomposed into a sum of differential forms
whose supports are compact sets contained in the charts considered. By the
additive formula (15), it, therefore, suffices to prove Stokes’ formula for these
forms. In other words, the support of $\omega$ can be assumed to be compact and
contained in one of these cubic charts $(U, \varphi)$.

Let us first consider the case $U \cap \Omega = U$. As the support of $\omega$ does not
meet $\partial\Omega$, the left hand side of Stokes’ formula is zero. To show that the same
is true for the right hand side, consider the open cube $\varphi(U) = \{|\xi^i| < 1\}$ of
$\mathbf{R}^n$. Thus $\omega$ is replaced by a form


(18.2) $\sum_i p_i(\xi)\, d\xi^1 \wedge \ldots \wedge \widehat{d\xi^i} \wedge \ldots \wedge d\xi^n$ ,

where the accent indicates that $d\xi^i$ must be omitted, and $d\omega$ by

(18.3) $\sum_i (-1)^{i-1} D_i p_i(\xi)\, d\xi^1 \wedge \ldots \wedge d\xi^n$ .


To find the Lebesgue-Fubini integral of the $i$-th term of (3), we can first
integrate with respect to $\xi^i$, which gives the variation over $]-1, +1[$, up to sign,
of the function


(18.4) $t \mapsto p_i(\xi^1, \ldots, \xi^{i-1}, t, \xi^{i+1}, \ldots, \xi^n)$ ;


as the support of $\omega$ is compact in the open set $\varphi(U)$, this function vanishes
if $t$ is sufficiently near $1$ or $-1$. The result follows.

The case $U \cap \Omega = U_+$ remains to be proved. Once again, there is a form (2)
on $\varphi(U)$, but now we integrate its exterior derivative (3) over the open set
$\xi^1 > 0$, so that integration with respect to the variable $\xi^i$ must be extended
to the interval $]-1, +1[$ if $i \neq 1$ and to the interval $]0, 1[$ if $i = 1$. If $i \neq 1$, the
result is zero as in the previous case. If, however, $i = 1$, the result is the variation of (4)
over $]0, 1[$; the support of $d\omega$ being compact in $\varphi(U)$, function (4) vanishes
for $t$ in the neighbourhood of $1$ as in the previous case, but not in the neighbourhood
of $t = 0$; so, applying the FT,

(18.5) $\int_\Omega d\omega = -\int_{K^{n-1}} p_1(0, \xi^2, \ldots, \xi^n)\, d\xi^2 \ldots d\xi^n$ .


This is the ordinary multiple integral extended over the cube $K^{n-1}$ of $\mathbf{R}^{n-1}$.

As for the integral of $\omega$ extended over $\partial\Omega$, i.e. over $U \cap \partial\Omega$, it is computed
by using the cubic chart $(U_0, \varphi_0)$ of the manifold $\partial\Omega$, where $U_0 = U \cap \partial\Omega$ as
above and where


$\varphi_0(x) = (\varphi^2(x), \ldots, \varphi^n(x))$ .


The image of $U_0$ under this chart is precisely the cube $K^{n-1}$ appearing in (5),
and $\varphi_0$ transforms $\omega$ into the form that can be deduced from (2) by replacing
$\xi^1$ by $0$, which cancels all the terms of (2) for which $i \neq 1$. Hence


(18.6) $\int_{\partial\Omega} \omega = \varepsilon \int_{K^{n-1}} p_1(0, \xi^2, \ldots, \xi^n)\, d\xi^2 \ldots d\xi^n$ ,


with a sign $\varepsilon$ depending on the orientation of $\partial\Omega$, which has not yet been
defined. Hence to obtain Stokes’ formula in this case, it is necessary to ensure
that $\varepsilon = -1$, in other words to orient $\partial\Omega$ so that the chart $(U_0, \varphi_0)$ is
incompatible with this orientation.

The result can be stated differently. Let $f$ be the inverse map of $\varphi$. Consider
the point $x_0 = f(0) \in U \cap \partial\Omega$ and set $a_i = f'(0)e_i$, where $(e_i)$ is the
canonical basis for $\mathbf{R}^n$. This gives a basis for $X'(x_0)$ defining its orientation
since the chart $(U, \varphi)$ is compatible with it; besides, the vectors $a_2, \ldots, a_n$
form a basis for the tangent subspace to $Y = \partial\Omega$ at $x_0$ and define the
orientation of $\partial\Omega$ opposite to the one appropriate for Stokes’ theorem. Having
said this, consider the curve


$\gamma : t \mapsto f(-t, 0, \ldots, 0)$


in $X$ passing through $x_0$ for $t = 0$. The assumptions made about the chart
$(U, \varphi)$ imply that for sufficiently small $|t|$, $\gamma(t) \in \Omega$ for $t < 0$ and $\gamma(t) \notin \Omega$
for $t > 0$. The curve $\gamma$ is, therefore, the trajectory of a moving object coming
out of $\Omega$ by crossing the boundary of $\Omega$ at $x_0$ at time $t = 0$; its velocity
vector at $t = 0$ is $-a_1$. The basis $(-a_1, a_2, \ldots, a_n)$ being incompatible with
the orientation of $\Omega$, it can be made compatible by an odd permutation of
$a_2, \ldots, a_n$ which transforms these vectors into a basis for $Y'(x_0)$ compatible
with the orientation of $\partial\Omega$. The rule to be applied can, therefore, be stated
as follows: let $h_1 \in X'(x)$ be the velocity at $x$ of a moving object coming out
of $\Omega$ at the point $x$. A basis $(h_2, \ldots, h_n)$ for the tangent space to $\partial\Omega$ at $x$
defines the orientation of $\partial\Omega$ if and only if the basis $(h_1, \ldots, h_n)$ for $X'(x)$ is
compatible with the orientation of $X$.

Consider the simplest case: $X$ is a 2-dimensional submanifold in $\mathbf{R}^3$, i.e.
what physicists mean by a “surface”, $\partial\Omega$ being a curve drawn in $X$ and
limiting an open subset $\Omega$ of $X$. Choose a basis $(h_1, h_2)$ at a point $x \in \partial\Omega$
for the traditional tangent plane $T_x(X)$ defining the orientation of $X$ and for
which $h_2$ is tangent to the curve limiting $\Omega$. If the vector $h_1$ “comes out of”
$\Omega$, $\partial\Omega$ must be oriented like $h_2$. But if $(h_1, h_2)$ defines the orientation of $X$,
this means that the orientation of its normal is that of the vector $h_1 \wedge h_2$. In
other words, the vectors $h_1$, $h_2$ and the unit vector of the oriented normal at
$x$, written in this order, must form a “direct” trihedron. We thus recover the
classical rule described in n° 9, (iv): a passerby following the boundary of
$\Omega$ in the direction prescribed by Stokes’ formula and remaining oriented like
the normal to the surface, when looking in front of him or her, must leave $\Omega$
on his or her left, which until further notice is the same for both genders.

In particular, this rule applies to an open subset $\Omega$ of $\mathbf{C}$ limited by one or
several closed simple curves $\gamma_0, \ldots, \gamma_p$, as can be encountered in the
theory of holomorphic functions; if $\Omega$ is assumed to be interior to $\gamma_0$ and
exterior⁷¹ to $\gamma_1, \ldots, \gamma_p$ and if $\Omega$ is given the usual orientation, then $\gamma_0$ must
be “positively” oriented (counterclockwise) and the other $\gamma_k$ negatively.
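
The orientation rule can be checked numerically on a simple case. The following is only an illustrative sketch, under assumptions not taken from the text: $\Omega$ is the annulus $1 < |z| < 2$, and the form is $\omega = x\,dy$, so that $d\omega = dx \wedge dy$ and the right hand side of Stokes' formula is the area $3\pi$. The function name and the step count are hypothetical choices.

```python
import math


def circle_integral_x_dy(radius, direction, steps=10000):
    """Approximate the integral of x dy along the circle |z| = radius.

    direction = +1 runs counterclockwise, direction = -1 clockwise.
    Each small arc contributes x(midpoint) * (change of y along the arc).
    """
    total = 0.0
    for k in range(steps):
        t0 = direction * 2 * math.pi * k / steps
        t1 = direction * 2 * math.pi * (k + 1) / steps
        x_mid = radius * math.cos((t0 + t1) / 2)
        dy = radius * (math.sin(t1) - math.sin(t0))
        total += x_mid * dy
    return total


# Outer circle counterclockwise, inner circle clockwise, as the rule prescribes.
outer = circle_integral_x_dy(2.0, +1)
inner = circle_integral_x_dy(1.0, -1)
boundary_integral = outer + inner

# Integral of d(x dy) = dx ^ dy over the annulus, i.e. its area.
area = math.pi * (2.0 ** 2 - 1.0 ** 2)

# Orienting both circles counterclockwise would give a different value.
wrong_orientation = outer + circle_integral_x_dy(1.0, +1)
```

With the prescribed orientations the two sides agree up to discretization error, whereas orienting the inner circle counterclockwise adds its area instead of subtracting it.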

Suppose now that $X = \mathbf{R}^3$, which is another classical case, so that $\Omega$ is an
open bounded set whose border is a 2-dimensional compact submanifold of $X$,
for example a sphere, a torus, etc. It is natural to orient $\Omega$ like the canonical
basis for $X$. It then remains to orient $\partial\Omega$ in such a way that Stokes’ (for that
matter, Ostrogradsky’s) formula holds. The general result shows that if we
choose a basis $(h_1, h_2, h_3)$ at $x \in \partial\Omega$, oriented like the canonical basis and
such that (i) $h_1$ is the velocity vector at $x$ of a moving object departing from
$x$ and coming out of $\Omega$, (ii) $(h_2, h_3)$ form a basis for the tangent plane to $\partial\Omega$ at
$x$, then $\partial\Omega$ needs to be oriented like the basis $(h_2, h_3)$. This amounts to saying
that the normal vector $h_2 \wedge h_3$ to $\partial\Omega$ at $x$ must come out of $\Omega$.


⁷¹ In the sense given to these terms in Chapter VIII: $\mathrm{Ind}_\gamma(z) = \pm 1$ or $0$ according
to whether $z$ is interior or exterior to $\gamma$.


X — The Riemann Surface of an Algebraic 
Function 


1 — Riemann Surfaces


Let $X$ be a 2-dimensional $C^0$ manifold in the sense of Chap. IX, n° 11, (ii).
If $(U, \varphi)$ is a chart for $X$, $\varphi$ may be considered a homeomorphism from the
open set $U$ onto an open subset of $\mathbf{C}$: if $\xi_1(x)$, $\xi_2(x)$ are the coordinates of
$\varphi(x) \in \mathbf{R}^2$ for $x \in U$, it suffices to agree that $\varphi(x) = \xi_1(x) + i\xi_2(x)$.

Let $(U, \varphi)$ and $(V, \psi)$ be two charts for $X$. The change of charts $\varphi(U \cap V)$
$\to \psi(U \cap V)$ is a priori only $C^0$, so that if, by miracle, a function defined on
$U \cap V$ can be expressed holomorphically in $(U, \varphi)$, there is no reason this is
also possible in $(V, \psi)$. For this, the change of charts would need to transform
holomorphic functions into holomorphic functions, in other words would need
to be a conformal representation of $\varphi(U \cap V)$ on $\psi(U \cap V)$.

This brings us to the notion of a Riemann surface (or of a complex analytic
manifold with complex dimension 1): It is a 2-dimensional connected¹
manifold $X$ of class $C^0$ with an atlas $(U_i, \varphi_i)$ all of whose changes of charts


$\varphi_i(U_i \cap U_j) \to \varphi_j(U_i \cap U_j)$


are holomorphic, in which case these are conformal representations (permute
$i$ and $j$). This leads to the more general definition of a holomorphic chart
$(U, \varphi)$ of $X$ by the condition that, for all $i$, coordinate changes $\varphi_i(x) \mapsto \varphi(x)$
and $\varphi(x) \mapsto \varphi_i(x)$ be holomorphic on the open sets on which they are defined.
When $(U, \varphi)$ is a holomorphic local chart at $a \in U$ such that $\varphi(a) = 0$, the
function $\varphi$ is said to be a local uniformizer at $a$; for this case, some authors
adopt the notation $q_a$ instead of $\varphi$. We will also occasionally do so despite the
fact that it could give the wrong idea that $q_a$ is determined by $a$. Holomorphic
functions being $C^\infty$, a Riemann surface is first of all a $C^\infty$ manifold.

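The most obvious example, apart from that of an open subset of $\mathbf{C}$, is the
Riemann sphere $\hat{\mathbf{C}} = \mathbf{C} \cup \{\infty\}$: it has an atlas with two charts $(U, \varphi)$ and
$(V, \psi)$, where $U = \mathbf{C}$, $\varphi(z) = z$, $V = \hat{\mathbf{C}} - \{0\}$, $\psi(z) = 1/z$.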

A more useful case, as it controls the theory of elliptic functions, consists
in choosing a lattice $L$ in $\mathbf{C}$, i.e. a discrete subgroup generated by two
non-proportional numbers $\omega_1$ and $\omega_2$ (Chap. II, n° 23) and in considering the
quotient space $X = \mathbf{C}/L$ of classes mod $L$. Equip it with the obvious topology:
$U \subset X$ is open if and only if its inverse image is open in $\mathbf{C}$ or, what
amounts to the same in these circumstances, if it is the image of an open
subset of $\mathbf{C}$; $X$ becomes a compact space homeomorphic to the torus $\mathbf{T}^2$. To
define a complex structure on $X$, first note that if $D \subset \mathbf{C}$ is a sufficiently
small open disc centered at $a$, its images under the translations $z \mapsto z + \omega$,
with $\omega \in L$, are pairwise disjoint, so that $p : \mathbf{C} \to \mathbf{C}/L$ maps $D$
homeomorphically onto an open subset $U$ of $X$. If $\varphi$ denotes the inverse map $U \to D$
to $p$, the couple $(U, \varphi)$ is a chart of $X$, and to find the analytic structure
sought, it suffices to show that the charts thus defined satisfy the condition
imposed above, which is obvious. The function $\varphi(x) - a$, where $a$ is the centre
of $D$, is then a local uniformizer at the point $p(a)$.


¹ Not assuming this would lead to ridiculous complications. To start with, theorem
1 below would become false.


If $X$ is a Riemann surface, for any open subset $O \subset X$, it is possible to
define functions $h$, which will be said to be holomorphic (resp. meromorphic)
on $O$, by requiring that they satisfy the following condition: for any
holomorphic chart $(U, \varphi)$, there exists a holomorphic (resp. meromorphic)
function $h_\varphi$ on the open subset $\varphi(U \cap O)$ of $\mathbf{C}$ such that $h(x) = h_\varphi[\varphi(x)]$
for all $x \in U \cap O$; checking it for the charts in an atlas would be enough.
An equivalent definition: in the neighbourhood of every $a \in U \cap O$, the function
$h$ is the sum of a power series (resp. Laurent series with finitely many
terms of degree $< 0$) in $\varphi(x) - \varphi(a)$:


(1.1) $h(x) = \sum c_n [\varphi(x) - \varphi(a)]^n = \sum c_n q_a(x)^n$ .


Hence, like in $\mathbf{C}$, a meromorphic function on $O$ is not defined everywhere,
unless it is assigned the value $\infty$ at all points of a discrete subset of $O$.
Denote by $H(O)$ (resp. $M(O)$) the set of holomorphic (resp. meromorphic)
functions on $O$. Their defining property is clearly of a local nature. Like on
$\mathbf{C}$, the usual algebraic operations, including division, can be performed on
meromorphic functions on $X$: if $f$ is not the function $0$, its zeros are isolated
since $X$ is connected, which removes all difficulties. To be correct, the result
should also be defined at points where, apparently, this is not the case: if for
example $f$ and $g$ have poles at $a \in X$ and if their polar parts in their Laurent
series (1) cancel mutually, a value is assigned to $f + g$ at $a$, namely the sum
of the constant terms of their Laurent series. Therefore, the set $M(X)$ of
meromorphic functions on $X$ can also be considered a commutative field.

An essential difference between $C^\infty$ theory and that of Riemann surfaces
concerns the existence of holomorphic or meromorphic functions on a given open set
$O$: while it is clear if $O$ is contained in the domain of a chart, it is not
obvious in the other cases. Showing (which we will not do here) that there are
many meromorphic functions defined globally on a Riemann surface is the first
major difficulty encountered as we progress in the theory.

This is not surprising. If, as above, we consider a lattice $L$ in $\mathbf{C}$ and
the Riemann surface $X = \mathbf{C}/L$, holomorphic or meromorphic functions on
an open subset $O$ of $X$ are those which, composed with $p : \mathbf{C} \to \mathbf{C}/L$,
are holomorphic or meromorphic on the open subset $p^{-1}(O)$ of $\mathbf{C}$. They are
invariant under translations $z \mapsto z + \omega$, $\omega \in L$. Any global existence proof
of meromorphic functions on $\mathbf{C}/L$, therefore, shows, without the slightest
calculation, the existence of elliptic functions, i.e. of meromorphic functions
on $\mathbf{C}$ invariant under translations $z \mapsto z + \omega$. This result cannot be made
trivial: either you prove the theorem holds for all Riemann surfaces, or else,
like Weierstrass, you write series


$\sum_{\omega \in L} 1/(z - \omega)^k , \quad k = 4, 6, 8, \ldots$


answering the question (Chap. II, n° 23).

Let us now define the order $v_a(h)$ of a meromorphic function $h$ at a
point $a$. For this choose a local uniformizer $q_a$ at $a$. This gives a Laurent
series expansion $h(x) = \sum c_n q_a^n(x)$ in the neighbourhood of $a$. $v_a(h)$ is then
the smallest integer $n$ for which $c_n \neq 0$. This definition does not depend on the
choice of the chart.

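Let $\omega$ be a complex-valued $C^\infty$ differential form of degree 1 on $X$ (or
more generally on an open subset $O$ of $X$: replace $X$ by $O$). Let $(U, \varphi)$ be a
local chart of $X$, understood to be henceforth always holomorphic. Setting
$\varphi(x) = \zeta$, $(U, \varphi)$ transforms $\omega$ into a form $\omega_\varphi$ on $\varphi(U)$ which can be written


(1.2) $\omega_\varphi = h_\varphi(\zeta)\,d\zeta + k_\varphi(\zeta)\,d\bar\zeta$ ,


with $C^\infty$ functions $h_\varphi$ and $k_\varphi$ depending on the chart considered. $\omega$ will
be said to be holomorphic if, for every chart $(U, \varphi)$, $\omega_\varphi = h_\varphi(\zeta)\,d\zeta$ with a
holomorphic function $h_\varphi$ on $\varphi(U)$ or, equivalently, if $\omega$ is the inverse image
under $\varphi$ of a holomorphic differential form on $\varphi(U)$. If $(V, \psi)$ is another chart,
changes of charts


$\theta : \varphi(U \cap V) \to \psi(U \cap V), \quad \rho : \psi(U \cap V) \to \varphi(U \cap V)$


clearly (transitivity of inverse images) transform $\omega_\varphi$ into $\omega_\psi$ and conversely;
hence


$\omega_\varphi = h_\varphi(\zeta)\,d\zeta \Longrightarrow \omega_\psi = h_\varphi[\rho(\zeta)]\,d[\rho(\zeta)] = h_\varphi[\rho(\zeta)]\,\rho'(\zeta)\,d\zeta$ ,


so that the coefficient $h_\varphi$ of $\omega$ in the local chart $(U, \varphi)$ is transformed by


(1.3) $h_\psi(\zeta) = h_\varphi[\rho(\zeta)]\,\rho'(\zeta)$ .


This is a very particular case of tensor calculus formulas. Like in $\mathbf{C}$, $d\omega = 0$.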
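
As a worked instance of (1.3), offered only as an illustration, take the two charts $\varphi(z) = z$ and $\psi(z) = 1/z$ of the Riemann sphere introduced above and the form whose expression in the first chart is $d\zeta$ (that is, $dz$). The change of charts from the second to the first is then $\rho(\zeta) = 1/\zeta$:

```latex
\[
\rho(\zeta) = \frac{1}{\zeta}, \qquad \rho'(\zeta) = -\frac{1}{\zeta^2}, \qquad h_\varphi = 1,
\]
\[
h_\psi(\zeta) = h_\varphi[\rho(\zeta)]\,\rho'(\zeta) = -\frac{1}{\zeta^2},
\qquad
\omega_\psi = -\frac{d\zeta}{\zeta^2} \quad (\zeta \neq 0).
\]
```

In the second chart the coefficient thus extends meromorphically to $\zeta = 0$, with a pole of order 2 there.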

More generally, meromorphic differential forms can be defined on $X$: these
are holomorphic forms on $X - D$, where $D$ is a discrete subset of $X$ and such
that, for all $a \in D$, $\omega_\varphi = h_\varphi(\zeta)\,d\zeta$ in a sufficiently small (hence in any) local
chart $(U, \varphi)$ at $a$, where $h_\varphi$ is meromorphic on $\varphi(U)$ and has a unique pole
at the point $\varphi(a)$. If for example $h$ is a meromorphic function on $X$ and if
$h$ is represented by a meromorphic function $h_\varphi(\zeta)$ in $(U, \varphi)$ then, by the
multivariate chain rule, its differential $dh$ is represented by $h'_\varphi(\zeta)\,d\zeta$; therefore,
$dh$ is a meromorphic differential form with the same poles as $h$. The product
of a meromorphic differential form $\omega$ and of a meromorphic function is defined
in an obvious way. For example, for any meromorphic function $h$, consider
the form $dh/h$ whose poles, like in $\mathbf{C}$, are clearly the poles and zeros of $h$.
For a differential form $\omega = h_\varphi(\zeta)\,d\zeta$, where $h_\varphi$ is meromorphic on $\varphi(U)$,
there is an expansion $h_\varphi(\zeta) = \sum c_n [\zeta - \varphi(a)]^n$ in the neighbourhood of every
$a \in U$; by definition, the coefficient $c_{-1}$ is the residue of $\omega$ at $a$, written
$\mathrm{Res}(\omega, a)$. It also is independent of the choice of the chart [Chap. VIII, n° 5,
(v): invariance of the residue under conformal representation]. For example,
take $\omega = dh/h$, where $h$ is meromorphic on $X$; if $h$ is represented in the chart
$(U, \varphi)$ by a meromorphic function $h_\varphi(\zeta)$, then $\omega$ is clearly represented by the
form $h'_\varphi(\zeta)\,d\zeta/h_\varphi(\zeta)$; $\omega$ is, therefore, a meromorphic differential form, and


(1.4) $\mathrm{Res}_a(dh/h) = v_a(h)$


as in $\mathbf{C}$.
We now prove a result generalizing what has been seen in Chap. VIII, n° 5
about functions defined on the Riemann sphere:


Theorem 1. Let $X$ be a compact Riemann surface.


(a) Any function defined and holomorphic on $X$ is a constant.
(b) For any meromorphic function $f$ on $X$,


(1.5) $v(f) = \sum_a v_a(f) = 0$ .


(c) For any meromorphic differential form $\omega$ on $X$,

(1.6) $\sum_a \mathrm{Res}_a(\omega) = 0$ .


(a) is obvious: $X$ being compact, the modulus of an everywhere holomorphic
function $h$ reaches its maximum somewhere, and so $h$ is constant in the
neighbourhood of that point (maximum principle). As $X$
is by definition connected, classical arguments apply verbatim. (Corollary:
the only entire elliptic functions on $\mathbf{C}$ are the constants.)

To obtain (b), it suffices, by (4), to prove (c). It will follow from Stokes’
theorem. The latter can be applied since the Jacobians of holomorphic changes
of charts are $> 0$, making it possible for $X$ to be oriented by these charts.

Having said this, $X$ being compact and the poles of $\omega$ being isolated, the
latter are finite in number; let us denote them by $a_k$ $(1 \leq k \leq n)$. For each
$k$, choose a local chart $(U_k, \varphi_k)$ such that $\varphi_k(a_k) = 0$ and, for sufficiently
small $r > 0$, denote by $D_k(r)$ the set of $x \in U_k$ such that $|\varphi_k(x)| \leq r$. These
“discs” are closed in $X$ and, if $r$ is sufficiently small, also pairwise disjoint;
then $\omega$ has a unique pole at $a_k$ in $D_k(r)$ and even in $D_k(r')$ for some $r' > r$.

Next consider the open subset $G$ obtained by removing the discs $D_k(r)$
from $X$. Its border is the union of the “circles” limiting the “discs” $D_k(r)$.
In the neighbourhood of a border point of $D_k(r) \subset D_k(r')$, the situation is
similar to that of two concentric discs in $\mathbf{C}$. It is, therefore, clear that, on the
one hand, the border of $G$ is a (real) one-dimensional submanifold of $X$, namely the
union $S$ of the circles $|\varphi_k(x)| = r$, and that, on the other, in the neighbourhood
of a border point of $G$, the open subset $G$ is located on only one side of its
boundary.

$\omega$ is holomorphic and a fortiori $C^\infty$ on the manifold $X' = X - \{a_1, \ldots, a_n\}$.
$G$ is open in $X'$, with compact closure in $X'$ since the latter, being the
complement in $X$ of the open discs $|\varphi_k(x)| < r$, is closed in the compact set $X$
and does not contain any $a_k$. Stokes’ theorem can, therefore, be applied:


$\int_{\partial G} \omega = \int_G d\omega$ .


$d\omega = 0$ since $\omega$ is holomorphic, so that the sum of the integrals of $\omega$ along
the “circles” limiting the $D_k(r)$ is zero.

Using the chart $(U_k, \varphi_k)$ which transforms $\omega$ into a form $\omega_k = h_k(\zeta)\,d\zeta$
in the disc $|\zeta| < r'$, it becomes clear that $(U_k, \varphi_k)$ transforms the orientation
of $X$, hence of $X'$, into the standard orientation of $\mathbf{C}$; it also transforms
the boundary of $D_k(r)$, oriented in conformity with Stokes’ formula, into the
circle $|\zeta| = r$ oriented counterclockwise. So the diffeomorphism $\varphi_k$ transforms
the integral of $\omega$ extended over the boundary of $D_k(r)$ into the integral
of $\omega_k$ extended over the circumference $|\zeta| = r$, equal to $2\pi i\,\mathrm{Res}_0(h_k)$. But the residue
of $h_k$ at $0 = \varphi_k(a_k)$ is, by definition, the residue of $\omega$ at $a_k$. The theorem
follows.


Statement (b) has important corollaries. First, if $f$ is a meromorphic
function on $X$, for any $c \in \mathbf{C}$, the functions $f$ and $f - c$ clearly have the same
poles with the same multiplicities. In conclusion, as $\sum v_a(f)$ is the difference
between the number of zeros and the number of poles of $f$ (counted with their
multiplicities), for a compact surface, the number of solutions of $f(x) = c$ is
independent of $c$ and equal to the number of poles of $f$.
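
Statement (b) and this corollary can be checked by hand on the Riemann sphere. The rational function below is chosen purely as an illustration and does not come from the text; the order at infinity is computed with the local uniformizer $\zeta = 1/z$.

```latex
\[
f(z) = \frac{z^2}{z-1}: \qquad v_0(f) = 2, \qquad v_1(f) = -1,
\]
\[
f(1/\zeta) = \frac{1}{\zeta(1-\zeta)} \quad\Longrightarrow\quad v_\infty(f) = -1,
\qquad
\sum_a v_a(f) = 2 - 1 - 1 = 0,
\]
\[
f(z) = c \iff z^2 - cz + c = 0 \qquad (c \in \mathbf{C}).
\]
```

The quadratic equation has two roots counted with multiplicity for every $c \in \mathbf{C}$, which matches the two simple poles of $f$ at $1$ and $\infty$.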

On the other hand, if $\omega$ is a meromorphic form on $X$ and if $(U, \varphi)$ is a
holomorphic local chart at $a$, with $\varphi(a) = 0$, the image $\omega_\varphi = h_\varphi(\zeta)\,d\zeta$ of $\omega$ in
$\varphi(U)$ is meromorphic on $\varphi(U)$. The order $v_a(\omega)$ at $a$ is then defined by the
relation


$v_a(\omega) = v_0(h_\varphi)$ .


Since the coefficient $\rho'(\zeta)$ in (3) is holomorphic and $\neq 0$ at $a$, the definition
does not depend on the choice of the chart $(U, \varphi)$. The $a \in X$ where $v_a(\omega) \neq 0$
form a discrete and hence a finite set if $X$ is compact. Obviously, $v_a(h\omega) =
v_a(h) + v_a(\omega)$ for every meromorphic function $h$ on $X$. Hence setting


$v(\omega) = \sum_a v_a(\omega)$

for any meromorphic form on $X$, $v(h\omega) = v(h) + v(\omega) = v(\omega)$ for all
meromorphic functions $h$. But if $\omega$ and $\omega'$ are two meromorphic forms, there is
a meromorphic function $h$ such that $\omega' = h\omega$. This is obvious in any local
chart ($h$ is the ratio between the coefficients of $\omega$ and $\omega'$), and the functions $h$
obtained in the local charts can be “glued” to give a globally defined function
because of the transformation formula (3) which is identical for $\omega$ and
$\omega'$. The conclusion is that the integer $v(\omega)$ is the same for all meromorphic
differentials on $X$. Set


$v(\omega) = 2g - 2$ ,


where $g$ is the genus of the compact Riemann surface $X$.

It can be proved that $g$ is an integer $\geq 0$ and, what is far less obvious, that two
compact Riemann surfaces are homeomorphic² if and only if they have the
same genus. The Riemann sphere has genus 0. Indeed, the differential form
$\omega = dz$ has a double pole at infinity since it must be computed by using the
local uniformizer $\zeta = 1/z$, and as, at all $a \in \mathbf{C}$, $\omega = 1 \cdot d(z - a)$ with $1 \neq 0$,
the pole at infinity is the only contribution to the calculation of $v(\omega)$; hence
$v(\omega) = -2$ and $g = 0$. Conversely, any compact Riemann surface of genus 0 is
isomorphic (and not only homeomorphic) to the Riemann sphere. For $g = 1$,
we get the quotients $\mathbf{C}/L$ of the theory of elliptic functions; this classical case
will be studied in volume IV. For $g > 1$, $X$ is homeomorphic to a sphere with
$g$ handles.
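
That the quotients $\mathbf{C}/L$ have genus 1 can be checked along the same lines. The following is a sketch only; the name $\omega_0$ for the form induced by $dz$ on $\mathbf{C}/L$ is introduced here for illustration, and $\varphi$ denotes one of the charts of $\mathbf{C}/L$ obtained by inverting $p$ locally.

```latex
\[
d(z + \lambda) = dz \quad (\lambda \in L)
\quad\Longrightarrow\quad
dz = p^*\omega_0, \qquad (\omega_0)_\varphi = 1 \cdot d\zeta,
\]
\[
v_a(\omega_0) = 0 \ \text{ for all } a \in \mathbf{C}/L
\quad\Longrightarrow\quad
v(\omega_0) = 0 = 2g - 2
\quad\Longrightarrow\quad
g = 1 .
\]
```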


2 — Algebraic Functions 


Riemann imagined his surfaces in order to study algebraic functions of one
variable and in particular, to make them uniform, though in his work they
were far less clearly defined than here. To understand this, what is meant
by an algebraic function $\zeta = F(z)$ of a complex variable $z$ should be first
understood. The foremost characteristic of an algebraic “function” is that it
is not a function: like $\mathrm{Log}\, z$, which is not algebraic, or like $z^{1/3}$ or


$[(z^2 + 1)^{1/2} - (z^2 - 2z + 1)^{1/3}]^{1/4}$


that are so, it can take several values (an infinite number in the first case, 3
in the second and 24 in the third) for a given value of $z$; the notation $F(z)$
can, therefore, only represent a set of complex numbers, the only notation
making sense being $\zeta \in F(z)$ and not $\zeta = F(z)$. By definition, the elements of
$F(z)$ are the roots (possibly including $\infty$ as will be seen later) of an equation


(2.1) $P(z, \zeta) = 0$ ,

where


(2.2) $P(X, Y) = \sum a_{pq} X^p Y^q = P_0(X)Y^n + \ldots + P_n(X) = Q_0(Y)X^m + \ldots + Q_m(Y)$


is a given polynomial³ in two variables or “indeterminates”, with complex
coefficients, for example $Y^3 - X$ in the case of $z^{1/3}$ and an equation of degree
24 in the third example.⁴ If $P_0$ is not identically zero, in which case $F$ will
be said to be of degree $n$ (it would perhaps be better to say $1/n$ to avoid
confusion with the degree of a polynomial), equation (1) has at most $n$ roots
and, if $P_0(z) \neq 0$, then for given $z$, the number of possibly multiple roots is
exactly $n$.


² But not isomorphic as complex manifolds (an isomorphism being a holomorphic
homeomorphism whose inverse is also holomorphic). The classification of
Riemann surfaces, up to isomorphism, is far more complicated.


Fig. 2.1. 


³ In what follows, assuming that $P$ is irreducible, i.e. cannot be non-trivially written
as the product $P = QR$ of two other polynomials, will prove useful since
otherwise the equations $Q = 0$ and $R = 0$ would need to be considered
separately; as will be seen, this is also a necessary condition for the Riemann
surface that will be constructed to be connected. Any polynomial $P$ is a product
of irreducible factors: consider a factor $Q$ of minimum total degree and apply
an induction argument on the total degree of $P$. The total degree is the largest
integer $d$ such that $a_{pq} \neq 0$ for a couple $(p, q)$ such that $p + q = d$.

⁴ To compute it, construct a polynomial in $Y$ having as roots the six differences
$u - v$, where $u = \varepsilon(X^2 + 1)^{1/2}$ with $\varepsilon \in \{1, -1\}$ and $v = \omega(X^2 - 2X + 1)^{1/3}$,
where $\omega$ is a cubic root of unity. The result (it can a priori be shown) is then seen
to be a polynomial $p(X, Y)$; the “irrationals” disappear since the coefficients
of $p$ are the elementary symmetric functions of the six differences $u - v$. The
equation sought is then $p(X, Y^4) = 0$.

Equation (1) has multiple roots for the values of $z$ for which the equations
$P(z, \zeta) = 0$ and $D_2P(z, \zeta) = 0$ have common roots at $\zeta$, where $D_2$ is the
differential operator with respect to $\zeta$. The classical result from algebra⁵
(that the set of these values of $z$ is finite) follows. For the equation $\zeta^2(z^2 - 1)
- z^3 = 0$, whose graph in $\mathbf{R}^2$ has a cusp point at the origin, that is the case
if $z = 0$, $+1$ or $-1$; for $z$ in the neighbourhood of $0$, the equation obviously has
two roots in the neighbourhood of $0$, and though they can be distinguished by
their sign in the real domain, this is impossible in the complex one. Indeed,
following by continuity one of the roots along a small circle centered at $0$, we end
up with the opposite root since the number $z^3/(z^2 - 1)$, with $\zeta$ as “the”
square root, describes a curve winding three times around the origin. In the particularly simple
case when $P(X, Y) = Y^2 - X$, which corresponds to $\zeta = z^{1/2}$, the curve
does not have any singularity at the origin, but its tangent at $0$ is vertical
and the conclusion is the same: the “function” $z^{1/2}$ is not well-defined in the
neighbourhood of $0$ as already explained in Chapter IV.
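
The exchange of the two roots can be observed numerically. The sketch below is an illustration under assumptions of ours (circle of radius 0.5, 2000 steps, the function name `track_root`): it follows a root of $\zeta^2(z^2 - 1) - z^3 = 0$ by continuity, choosing at each step whichever of the two square roots is closer to the previous value.

```python
import cmath
import math


def track_root(radius=0.5, steps=2000):
    """Follow one root zeta of zeta^2 (z^2 - 1) - z^3 = 0 by continuity
    while z runs once around the circle |z| = radius (radius < 1).

    Returns the root at the starting point and the root reached after one turn.
    """
    def w(z):
        # zeta^2 must equal this number
        return z ** 3 / (z * z - 1)

    zeta = cmath.sqrt(w(complex(radius, 0.0)))
    start = zeta
    for k in range(1, steps + 1):
        z = radius * cmath.exp(2j * math.pi * k / steps)
        candidate = cmath.sqrt(w(z))
        # continuity: keep whichever of +candidate, -candidate is nearer
        if abs(candidate - zeta) <= abs(-candidate - zeta):
            zeta = candidate
        else:
            zeta = -candidate
    return start, zeta


start, end = track_root()
```

After one full turn the tracked value is the opposite of the starting root, not the starting root itself, which is the phenomenon described above.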

In what follows, $S$ will denote the finite set of $z \in \mathbf{C}$ where equation
(1) has fewer than $n$ distinct roots, either because its degree decreases (points
canceling $P_0$), or because it has multiple roots (critical points).
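
The elimination argument recalled in the footnote on common roots of two polynomials can be tried out on small cases. The code below is a minimal sketch, not the author's computation: it builds the determinant of the $p + q$ linear equations (the so-called Sylvester determinant) for integer coefficients, treats the case $p = q = 2$, and then locates the integer points of $S$ for the example $\zeta^2(z^2 - 1) - z^3 = 0$. All function names are hypothetical.

```python
def sylvester_matrix(a, b):
    """Matrix of the p + q linear equations obtained from two polynomials.

    a and b are coefficient lists, highest degree first, of formal
    degrees p = len(a) - 1 and q = len(b) - 1.
    """
    p, q = len(a) - 1, len(b) - 1
    size = p + q
    rows = []
    for i in range(q):  # q shifted copies of the coefficients of a
        rows.append([0] * i + list(a) + [0] * (size - p - 1 - i))
    for i in range(p):  # p shifted copies of the coefficients of b
        rows.append([0] * i + list(b) + [0] * (size - q - 1 - i))
    return rows


def determinant(m):
    """Exact determinant by expansion along the first row (tiny matrices only)."""
    if len(m) == 1:
        return m[0][0]
    total = 0
    for j, entry in enumerate(m[0]):
        if entry != 0:
            minor = [row[:j] + row[j + 1:] for row in m[1:]]
            total += (-1) ** j * entry * determinant(minor)
    return total


def resultant(a, b):
    return determinant(sylvester_matrix(a, b))


def discriminant_at(z):
    """Determinant for P = (z^2 - 1) zeta^2 - z^3 and D_2 P = 2 (z^2 - 1) zeta,
    both regarded as polynomials in zeta for a fixed integer z."""
    return resultant([z * z - 1, 0, -z ** 3], [2 * (z * z - 1), 0])


# Case p = q = 2: a common root (here y = 1) forces the determinant to vanish.
with_common_root = resultant([1, -3, 2], [1, -4, 3])     # (y-1)(y-2), (y-1)(y-3)
without_common_root = resultant([1, 0, -1], [1, 0, -4])  # y^2 - 1, y^2 - 4

# Integer values of z between -3 and 3 where the determinant vanishes.
exceptional = [z for z in range(-3, 4) if discriminant_at(z) == 0]
```

In this example the determinant is a polynomial in $z$ that is not identically zero, so it has finitely many zeros; this is the finiteness of $S$ in a concrete case.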

If $z$ and $\zeta$ are called canonical coordinates in $\mathbf{C}^2$, equation (1) defines a
complex algebraic curve in $\mathbf{C}^2$ (in contrast to real algebraic curves in $\mathbf{R}^2$).
In set theoretic language (Chapter I, n° 5), $F$ is a correspondence⁶ between
$\mathbf{C}$ and $\mathbf{C}$ whose graph is the curve. To transform $F$ into a function $F'$ in
the strict sense, it suffices to consider the curve and to set $F'(z, \zeta) = \zeta$, as
was done in § 4 of Chap. IV regarding the complex logarithm. Ignorance of
this simple procedure, and more generally of the abc of “abstract” set theory,
has long confused matters, and not only about algebraic functions. It explains
why Dieudonné somewhere qualified classical discourses on “multiform functions”
as verbosity without, however, going as far as explaining to his readers
how to transform this verbosity into perfectly correct mathematical arguments.
The main purpose of the theory is to construct a compact Riemann
surface on which $z$, the “function” $\zeta = F(z)$ and more generally any rational
expression in $z$ and $\zeta$ become genuine meromorphic functions. The graph of
$F$ is only a first approximation in the construction, which, as we will see, is
considerably more difficult.


The first step consists in considering the open set $B = \mathbf{C} - S$ defined by
the condition


$z \in B \iff \mathrm{Card}\, F(z) = n$


and in showing that the subset $X$ of the graph of $F$ located over $B$ admits
a natural complex analytic structure (in other words, is a non-compact Riemann
surface) for which $p : (z, \zeta) \mapsto z$ and $F' : (z, \zeta) \mapsto \zeta$ are holomorphic.
The compact Riemann surface sought will then be obtained by adjoining a
finite number of points to $X$, an operation analogous to the construction of
the Riemann sphere $\hat{\mathbf{C}}$ from $\mathbf{C}$.


⁵ If $P(Y)$ and $Q(Y)$ are polynomials of degrees $p$ and $q$ with coefficients in an
integral domain, for example $\mathbf{C}[X]$, and if $y$ is a common root of $P$ and $Q$, successive
multiplication of $P(y)$ by $1, y, \ldots, y^{q-1}$ and of $Q(y)$ by $1, y, \ldots, y^{p-1}$ gives $p + q$
homogeneous linear equations in the $y^n$ $(0 \leq n \leq p + q - 1)$; as this system admits a
non-zero solution (since $1 \neq 0$), its determinant, a polynomial in the coefficients of
$P$ and $Q$, must be zero. Do the calculations for $p = q = 2$.

⁶ Besides, this term was used in algebraic geometry long before the invention or
propagation of set theory.

Let us first prove the following result:


Lemma 1 (Continuity of the roots of an algebraic equation). Let $a$
be a point of $\mathbf{C}$ and $\alpha \in \mathbf{C}$ a multiple root of order $p$ of $P(a, \zeta) = 0$. There
exist numbers $r > 0$ and $\rho > 0$ such that, for all $z$ satisfying $0 < |z - a| < r$,
the equation $P(z, \zeta) = 0$ has exactly $p$ roots, all simple, in $0 < |\zeta - \alpha| < \rho$.


As seen in Chap. VIII, n° 5, (viii) in a more general context, if the given
$\rho > 0$ is sufficiently small, the number $\nu(z)$ of roots of $P(z, \zeta) = 0$ satisfying
$|\zeta - \alpha| < \rho$ is a continuous function, with respect to the topology of compact
convergence, of the function $P_z : \zeta \mapsto P(z, \zeta)$. But since $P$ is a continuous
function of the couple $(z, \zeta)$, hence uniformly continuous on every compact
subset of $\mathbf{C} \times \mathbf{C}$, $P_z$ is a continuous function of $z$ with respect to the topology
of compact convergence; $\nu(z)$ is, therefore, a continuous function of $z$ and, being
integer-valued, satisfies $\nu(z) = \nu(a)$ in the neighbourhood of $a$. Hence, for sufficiently
small $r$ and $|z - a| < r$, equation $P(z, \zeta) = 0$ has exactly $p$ not necessarily distinct
roots such that $|\zeta - \alpha| < \rho$. There being finitely many values of $z$ for which the equation
$P(z, \zeta) = 0$ has a multiple root, the only one of these values satisfying $|z - a| < r$
is $a$ if $r$ is sufficiently small. The $p$ roots of $P(z, \zeta) = 0$ such that $|\zeta - \alpha| < \rho$
are, therefore, simple if $0 < |z - a| < r$, qed.
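
For instance, with the simplest polynomial $P(X, Y) = Y^2 - X$ at $a = 0$, $\alpha = 0$ (so that $p = 2$), the numbers $r$ and $\rho$ of lemma 1 can be made explicit. This is a hand computation given as an illustration, where $\pm z^{1/2}$ denotes the two square roots of $z$:

```latex
\[
P(z, \zeta) = \zeta^2 - z = 0 \iff \zeta = \pm z^{1/2},
\qquad
|\zeta| = |z|^{1/2} < \rho \iff |z| < \rho^2 ,
\]
\[
\text{so } r = \rho^2 \text{ works: for } 0 < |z| < r \text{ there are exactly } p = 2
\text{ simple roots in } 0 < |\zeta| < \rho .
\]
```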

It will be seen later that $F$ decomposes into $n$ uniform branches $f_k(z)$ in
every simply connected open subset $U$ of $B$, a uniform branch in $U$ being, by
definition, a genuine function $f$ defined and (for the moment) holomorphic
on $U$, such that $f(z) \in F(z)$ for all $z \in U$; this is what had been proved in
§ 4 of Chap. IV for the pseudo-function $\mathrm{Log}\, z$ for $U \subset \mathbf{C}^*$.

A local result first needs to be proved:


Lemma 2. Let P(X, Y) be a polynomial with complex coefficients and E ⊂ C²
the set of simple points⁷ of the curve P(z, ζ) = 0. Then E is a submanifold
of C² = R⁴ and, for all (a, α) ∈ E where D₂P(a, α) ≠ 0, there are open
neighbourhoods V of a and W of α such that E ∩ (V × W) is the graph of a
function ζ = f(z) defined and holomorphic on V with values in W.


Equivalently, there exists a unique holomorphic function f on V satisfying
(2.3)    f(a) = α  and  P[z, f(z)] = 0 for all z ∈ V.


f is called a local uniform branch at a of the algebraic function defined by
P.

Since α is a simple root of P(a, ζ) = 0, there are (lemma 1) neighbourhoods
V and W of a and α such that, for all z ∈ V, the equation P(z, ζ) = 0
has exactly one root ζ ∈ W. We need to show that this unique root ζ is a
holomorphic function of z.


⁷ i.e. points where D₁P(z, ζ) and D₂P(z, ζ) are not both zero.

Regard the map


    P : (z, ζ) ↦ P(z, ζ)


from C² to C as a map from R⁴ to R². At each point, it has a tangent linear
map P′(z, ζ) which is C-linear since P is holomorphic in z and ζ, namely
(Chap. IX, formula (2.24))


(2.4)    (h, k) ↦ D₁P(z, ζ)h + D₂P(z, ζ)k,


where h, k are vector variables in C and where the derivatives are taken to
be in the complex sense; (4) shows that P′ is surjective at every point of
the open subset Ω of C² where D₁P and D₂P are not both zero, so that
P : Ω → R² is a submersion. Hence, every equation P(z, ζ) = c defines a
closed submanifold of Ω having dimension 4 − 2 = 2 [Chap. IX, n° 13, (ii)],
hence a submanifold of R⁴ = C²; this is in particular the case of E. The
tangent vector space to E at a point (a, α) ∈ E where D₂P ≠ 0 is the set of
(h, k) such that


    D₁P(a, α)h + D₂P(a, α)k = 0,
i.e. such that
    k = −D₁P(a, α)h/D₂P(a, α);


hence the manifold E is the graph of a map ζ = f(z) in the neighbourhood
of (a, α), where f is C^∞, with a tangent linear map given by


    f′(a)h = −D₁P(a, α)h/D₂P(a, α)


[Chap. IX, n° 13, (iv)]. This formula shows that f′(a) is C-linear, and so f
is holomorphic, qed.

Similarly, E could be shown to be the graph of a holomorphic function
z = g(ζ) in the neighbourhood of a point (a, α) where D₁P ≠ 0.
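
The formula for the derivative of a local uniform branch, f′(a) = −D₁P(a, α)/D₂P(a, α), can be checked numerically on a simple case. The Python sketch below is only an illustration under stated assumptions: the polynomial P(z, ζ) = ζ² − z, the base point (a, α) = (1, 1) and the helper names branch, D1P, D2P are choices made for the example and do not come from the text, and Newton's method is used merely as a convenient way of evaluating the branch near a.

```python
# Illustrative sketch (assumed example, not from the text):
# P(z, zeta) = zeta**2 - z near the simple point (a, alpha) = (1, 1).

def P(z, zeta):
    return zeta * zeta - z

def D1P(z, zeta):
    # complex derivative of P with respect to z
    return -1.0

def D2P(z, zeta):
    # complex derivative of P with respect to zeta
    return 2.0 * zeta

def branch(z, start, steps=30):
    """Evaluate the local uniform branch through `start` by Newton's method in zeta."""
    zeta = start
    for _ in range(steps):
        zeta = zeta - P(z, zeta) / D2P(z, zeta)
    return zeta

a, alpha = 1.0 + 0.0j, 1.0 + 0.0j

# Value predicted by the formula f'(a) = -D1P(a, alpha) / D2P(a, alpha).
predicted = -D1P(a, alpha) / D2P(a, alpha)

# Difference quotients of f at a in two different complex directions h.
quotients = [(branch(a + h, alpha) - alpha) / h for h in (1e-6, 1e-6j)]
```

Both difference quotients agree with the predicted value up to an error of the order of |h|, whatever the direction of h; this independence of the direction is what the C-linearity of f′(a) expresses.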


Returning to the Riemann surface X, i.e. the graph of F over B, consider
some a ∈ B = p(X). Denoting by α_k the n simple roots of the equation
P(a, ζ) = 0, there are holomorphic functions f_k(z), 1 ≤ k ≤ n, in the
neighbourhood of a, satisfying


(2.5)    P[z, f_k(z)] = 0,    f_k(a) = α_k.


If D is a sufficiently small disc centered at a, these n local uniform branches
f_k at a are all defined on D and, being continuous, are pairwise distinct at
all z ∈ D since so are the f_k(a) = α_k. Denoting by D_k ⊂ X the image of
D under z ↦ (z, f_k(z)), the D_k are seen to be pairwise disjoint and to have
as their union the set p⁻¹(D) of points of X projecting onto D. The maps
p : D_k → D and f_k : D → D_k being continuous and mutually inverse, p is
a homeomorphism from D_k onto D, so that the D_k are arc-connected like
D. Finally, D_k is open in X since it is the set of (z, ζ) ∈ X satisfying z ∈ D
and ζ ∈ W_k, where W_k is an open neighbourhood of α_k. As a result, the
D_k are the connected components of p⁻¹(D) and p is a local homeomorphism
from X onto B.

On the other hand, for all k, the couple (D_k, p) is clearly a local chart
of X; we thus obtain an atlas for X, a priori C⁰. It is in fact holomorphic.
Indeed, let (a, α), (b, β) be two points of X and f, g be the local uniform
branches at a and b such that f(a) = α, g(b) = β. They are defined on
discs U and V centered at a and b and map them homeomorphically onto
open neighbourhoods f(U) and g(V) of (a, α) and (b, β). If f(U) ∩ g(V) ≠ ∅,
f and g are equal at least at one point of U ∩ V, hence on all of U ∩ V
(uniqueness of uniform branches on a connected open set), and the change of
charts taking the local chart (f(U), p) to the local chart (g(V), p) transforms
the coordinate p(z, ζ) = z of a point of the first chart into its coordinate
p(z, ζ) = z in the second. The change of coordinates is, therefore, the map
z ↦ z, which is as holomorphic as possible. The conclusion that follows from
these arguments is that there is a complex analytic structure on X turning X
into a Riemann surface. It is not yet compact, but it is a start.

To go from here to the existence of global uniform branches on every
simply connected open set contained in B and to complete X to obtain a
compact Riemann surface X̂ associated to P, it is helpful to develop some
aspects of general topology that can also be useful elsewhere.


3 — Coverings of a Topological Space 


As this theory requires quite a few explanations, I will break it down into 
several parts and confine myself to the essential minimum.⁸ In particular, I
will not mention the notion of a fundamental group of a space as it is not 
needed to construct Riemann surfaces of algebraic functions. 


(i) Definition of a covering. The notion of a covering space of a topological
space (separated, i.e. satisfying Hausdorff’s axiom) generalizes the situation
encountered at the end of the previous n°. Take two separated spaces X and
B and a continuous and surjective map p : X → B; although this is not
always necessary, we will suppose that the “base” B is connected⁹ and locally
connected (i.e. there are arbitrarily small connected neighbourhoods of all
points z ∈ B, which is for example the case with manifolds). To encourage the
reader to compare the general theory to arguments already used for complex
variables in the previous n° and in § 4 of Chap. IV, we will denote an arbitrary
point of B by z and an arbitrary point of X by ζ.


⁸ This section follows quite closely Chapter XVI.28 in Dieudonné’s Éléments
d’analyse, which follows even more closely what N. Bourbaki wrote on the
subject when it was on its agenda in the 1950s. As many homotopy experts
belonged to the group at the time, to start with Samuel Eilenberg and
Jean-Pierre Serre, as well as other people who had seriously thought about
the subject, it is unlikely that anything better could be achieved.

⁹ In all of this n° and in the rest of this §, “connected” will mean arc-connected.


In order for the triplet (X, B, p) to be a covering, we require it to be
locally trivial, more precisely:


(R) every point z ∈ B has an open neighbourhood D whose inverse
image p⁻¹(D) is the union of a family (D_i), i ∈ I, of pairwise disjoint open
sets mapped homeomorphically onto D by p.


The most obvious consequence of (R) is that p transforms every
neighbourhood of a point ζ ∈ X onto a neighbourhood of p(ζ). In particular,
p transforms every open subset of X onto an open subset of B. Moreover, for
all z ∈ B, the fiber p⁻¹({z}) of z in X is a discrete subset of X since its
intersections with the open subsets D_i reduce to a point.

Example 1. Choose B = C*, X = C and p to be the map


    e : ζ ↦ exp(2πiζ)


from X onto B. As e′(ζ) ≠ 0 everywhere, p is a local homeomorphism
(Chap. VIII, n° 5, theorem 7 applied locally). If a ∈ B is the image of
some α ∈ X and if D ⊂ B is a sufficiently small disc centered at a, then
there is a neighbourhood D′ of α homeomorphically mapped onto D by p. The
periodicity of the exponential function shows that


    p⁻¹(D) = ⋃_{n ∈ Z} (D′ + n) = ⋃_{n ∈ Z} D_n,


and if D′ (i.e. D) is sufficiently small, the translates D_n of D′ are pairwise
disjoint and homeomorphically mapped onto D by p. Hence C can be
regarded as a covering space, moreover simply connected, of C*. If we choose a
lattice L of periods like in the theory of elliptic functions, then C becomes a
simply connected covering space of the torus C/L. Finally, the map t ↦ e(t)
transforms R into a covering space of T = R/Z.
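
The description of the fiber of e over a point can be made concrete numerically. The following Python sketch is an illustration only; the helper name one_preimage and the sample point a are assumptions made for the example. It computes one point α of the fiber over a by means of the principal logarithm and checks that the translates α + n, n ∈ Z, all project onto a.

```python
import cmath

def e(zeta):
    """The covering map of Example 1: zeta -> exp(2*pi*i*zeta)."""
    return cmath.exp(2j * cmath.pi * zeta)

def one_preimage(a):
    """One point of the fiber over a != 0, obtained from the principal logarithm."""
    return cmath.log(a) / (2j * cmath.pi)

a = -0.5 + 1.5j                                   # arbitrary sample point of C*
alpha = one_preimage(a)
fiber_sample = [alpha + n for n in range(-3, 4)]  # a few points of the fiber over a
errors = [abs(e(w) - a) for w in fiber_sample]    # all of the order of rounding errors
```

Two distinct points of fiber_sample are at distance at least 1 from each other, in accordance with the discreteness of the fibers.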

Example 2. For a given integer k > 0, consider the quotient P/kZ of P
by the group of horizontal translations ζ ↦ ζ + nk, where n ∈ Z, with the
obvious topology. The function e(ζ/k) is invariant under these translations,
and so defines a continuous map p : P/kZ → D*. It is more or less obvious
that we thus obtain a “k-sheeted” covering space of D*, as it used to be
called earlier, i.e. the canonical covering of order k of D*. It can also be
constructed, up to isomorphism, by using the map z ↦ z^k from D* onto D*;
axiom (R) holds by lemma 1 of n° 2 applied to the polynomial Y^k − X for
a = 0; in fact, the canonical covering of order k of D* is just the subset of
the Riemann surface of the polynomial Y^k − X located over D*.
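
For the covering of D* defined by z ↦ z^k, the fiber over a point can be listed explicitly. The Python sketch below is an illustration under assumptions of my own choosing (the function name fiber_power and the sample values of z and k are hypothetical): it computes the k points of the fiber over z and the smallest distance between two of them.

```python
import cmath

def fiber_power(z, k):
    """The k points w of the fiber over z != 0 for the map w -> w**k."""
    r, theta = cmath.polar(z)
    root = r ** (1.0 / k)
    return [cmath.rect(root, (theta + 2 * cmath.pi * j) / k) for j in range(k)]

z, k = 0.3 + 0.2j, 5                      # sample point of the pointed disc, sample order
points = fiber_power(z, k)

# Smallest distance between two distinct points of the fiber.
min_gap = min(abs(points[i] - points[j])
              for i in range(k) for j in range(i + 1, k))
```

The k points are the vertices of a regular polygon centered at 0, so that sufficiently small discs centered at these points are pairwise disjoint, which is the situation described by axiom (R).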

It may happen that condition (R) holds for D = B. If every ζ ∈ D_i is
identified to the couple (z, i), where z = p(ζ), X is transformed into the
Cartesian product B × I equipped with the product topology induced by the
topology of B and the discrete topology of I (any subset of I is an open set);
conversely, the product X = B × I, where I is a discrete space, is a covering
space of B thanks to the map p(z, i) = z. Such a covering space is said to be
(globally) trivial or a decomposition.

In the case of algebraic functions (see end of the previous n°), there is an
“n-sheeted” covering space of B, but as shown by the graph X of the
pseudo-function ζ = Log z studied in Chapter IV, § 4, in the general case the
number of D_i's is not necessarily finite or even constant if B is not
connected.¹⁰ Here B = C*, X is the set of couples (z, ζ) ∈ C² such that
z = exp(ζ), and p(z, ζ) = z. In any disc D ⊂ C*, the “multiform function”
Log z decomposes into uniform branches L_k(z) depending on k ∈ Z and the
graphs D_k ⊂ X of these L_k are the connected components of p⁻¹(D). As an
aside, note that this situation is just the same as that in example 1:
associating the point (e(ζ), ζ) of the graph of Log z to all ζ ∈ C, we get a
homeomorphism from the first covering space onto the second one which
commutes with the corresponding maps p. Two such covering spaces of a same
space B are said to be isomorphic.

In the general case, if condition (R) holds in a neighbourhood D of z,
it obviously also holds in any neighbourhood D′ ⊂ D. Hence if B is locally
connected, D can be assumed to be connected; then so are the D_i. Since
the D_i are pairwise disjoint, they are the connected components of the open
subset p⁻¹(D) of X.

In practice, all the spaces considered are metrizable. If the topology of B
is defined by a distance d(x, y), then, for all z ∈ B, there exist numbers r > 0
such that X is trivial over the ball D(z, r). If r(z) denotes the upper bound of
these r, X is clearly trivial over D(z, r) for all r < r(z). If d(z, z′) < r < r(z),
X is trivial over D(z′, r′) for all r′ < r(z) − r since D(z′, r′) ⊂ D(z, r). In
conclusion, r(z′) ≥ r(z) − d(z, z′) and hence the function r is lower
semicontinuous (Chap. V, n° 10). For any compact subset K ⊂ B,


(3.1)    inf_{z ∈ K} r(z) = r(K) > 0


because there exists z ∈ K where r(z) is minimum (same reference).


(ii) Sections of a covering space. Since p : D_i → D is a homeomorphism,
we can consider the inverse map φ_i : D → D_i; it satisfies p ∘ φ_i = id; it
is the analogue of a local uniform branch. More generally, if E is a subset
of B, a section of X over E is any continuous map φ : E → X such that
p[φ(z)] = z for all z ∈ E, in other words, the analogue of a uniform branch on
E. Since p is a local homeomorphism, there are sections over all sufficiently
small neighbourhoods of all a ∈ B, namely the φ_i; there is even a section
taking a given value at a given point of D. If two sections φ and ψ defined
on the same set E ⊂ B are equal at a ∈ E, they are equal to the same φ_i at
a: since D_i is open in X, the values of φ and ψ in the neighbourhood of a
are in D_i, and so φ(z) = ψ(z) = φ_i(z) for all z ∈ E sufficiently near a.


¹⁰ Axiom (R) shows that the number of elements of p⁻¹({z}) is a locally constant
function of z ∈ B, and so is constant if B is connected. This number, whether
finite or not, is generally called the order of the covering space (X, B, p).

The set of points of E where φ and ψ are equal is, therefore, open in E;
it is also closed since φ and ψ are continuous. As a result, two sections of X
over a connected subset of B are identical if they are equal at some point. For
example, if F is an algebraic function and if there are two uniform branches
f and g of F on a connected open subset U ⊂ C − S (using the notation of
the previous n°), then the existence of some a ∈ U where f(a) = g(a) implies
that f = g in all of U. So there are at most n uniform branches on U if
n = d°(F) and, in fact, exactly n if U is simply connected (n° 4, theorem 6).

Let us now show that if φ is a section of X over a connected subset E
of B, then φ(E) is a connected component of p⁻¹(E). As (p⁻¹(E), E, p) is
obviously a covering of E, it suffices to do so for E = B. Since z ↦ (z, φ(z))
and (z, ζ) ↦ z are continuous and mutually inverse, φ is a homeomorphism
from B onto Y = φ(B), which is, therefore, connected. By (R), Y is clearly
open in X. On the other hand, if a sequence φ(z_n) ∈ Y converges to a limit,
the z_n = p[φ(z_n)] converge to some a ∈ B, and so lim φ(z_n) = φ(a). As a
result, Y is open and closed in X, and connected as the image of B, qed.

On the other hand, if Y is a connected component of X, then p(Y) is a
connected component of B, and so p(Y) = B if B is connected. Firstly, Y
being open in X and p being a local homeomorphism, p(Y) is open in B. Let
a ∈ B be a closure point of p(Y) and D a connected open neighbourhood of
a in B satisfying (R). As D meets p(Y), p⁻¹(D) = ⋃ D_i meets Y. If some
connected open D_i meets Y, then Y ∪ D_i is a connected open set containing
Y. Hence D_i ⊂ Y and a ∈ p(D_i) ⊂ p(Y), so that p(Y) is closed, qed.

Besides, Y is clearly a covering space of B: apply (R) by replacing X by
Y.

Finally suppose that, for any connected component Y of X, the map
p : Y → B is injective. It is then bijective if B is connected, and as p is a
local homeomorphism, it is a global homeomorphism from Y onto B. As a
result, Y is the image of B under a global section φ of X and the covering
(X, B, p) is trivial.


(iii) Path-lifting. Let γ : I → B be a continuous path in B, where
I = [0, 1]. A lifting of γ is a path μ : I → X such that p ∘ μ = γ (§ 4
of Chap. IV for the case of Log z). Such a lifting always exists, but there is
better still:


Theorem 2. Let (X, B, p) be a covering and γ : I → B a path in B. There
is a unique lifting μ of γ to X with a given initial point. If two paths γ₀
and γ₁ in B are fixed-endpoint homotopic and if μ₀ and μ₁ are liftings of γ₀
and γ₁ having the same initial point, then μ₀ and μ₁ have the same terminal
point and are homotopic.

We first prove the uniqueness of μ. In the neighbourhood of some t₀ ∈ I,
a lifting μ(t) of γ, being a continuous function of t, takes its values in a
neighbourhood of μ(t₀) mapped homeomorphically by p : X → B onto a
neighbourhood D of γ(t₀) satisfying axiom (R). Hence μ = φ ∘ γ in the
neighbourhood of t₀, where φ is the unique section of X over D mapping γ(t₀)
onto μ(t₀). It follows that the set of points where two liftings coincide is
open and closed in I, proving uniqueness.

To prove the existence of μ, choose α ∈ X such that p(α) = γ(0) = a
and, as in § 4 of Chap. IV, consider all the couples (J, μ), where J ⊂ I is an
interval with initial point 0 and μ is a continuous map from J to X satisfying
p ∘ μ = γ on J, as well as μ(0) = α. The existence of such couples as well as
their “coherence” is clear: take J sufficiently small so that X is trivial over
γ(J); if (J′, μ′) and (J″, μ″) are two such couples, then, by the uniqueness of
liftings, μ′ = μ″ in J′ ∩ J″. The union of all these J gives a couple (J₀, μ₀)
such that μ₀ cannot be extended beyond the terminal point b of J₀. But as
X is trivial over a connected open neighbourhood D of γ(b) in B, if b′ ∈ J₀
is sufficiently near b for γ(b′) ∈ D to hold, then there is a section of X in
D equal to μ₀(b′) at γ(b′), which makes it possible to extend μ₀ beyond b if
b < 1, and at b if b = 1. So b = 1 and J₀ = I.

Next, consider a homotopy σ : I × I = K → B between the paths γ₀ and γ₁
of the statement, whose initial and terminal points will be denoted by a
and b; hence


(3.2)    σ(s, 0) = a,  σ(s, 1) = b  for all s,
         σ(0, t) = γ₀(t),  σ(1, t) = γ₁(t)  for all t.


Let α be the common initial point of μ₀ and μ₁. The problem consists in
constructing a continuous map σ′ : K → X satisfying p ∘ σ′ = σ and


(3.3)    σ′(s, 0) = α  for all s,
         σ′(0, t) = μ₀(t),  σ′(1, t) = μ₁(t)  for all t.


In fact there is no need to require σ′(s, 0) to be independent of s, because
condition p ∘ σ′ = σ shows that s ↦ σ′(s, 0) is a continuous map from I to
the discrete space p⁻¹(a), and so is constant. Conditions (3) could even be
replaced by the unique condition that σ′(0, 0) = α; indeed, relation p ∘ σ′ = σ
shows that (1) σ′(s, 0) takes its values in p⁻¹(a), and so is constant, (2) σ′(0, t)
is a lifting of γ₀ having the same initial point as μ₀, and so is equal to μ₀,
(3) σ′(1, t) is a lifting of γ₁ having the same initial point as μ₁, and so is
equal to μ₁.

Like in the previous case, the uniqueness of σ′ is clear since K = I × I is
connected.

As σ(K) is compact and σ is uniformly continuous, (1) shows that there
exists r > 0 such that X is trivial over σ(D) for all discs D of radius r in
K. If we draw a grid on K consisting of the lines s = i/n and t = j/n, with
0 ≤ i, j ≤ n, where n is sufficiently large, then X is trivial over the images
σ(K_ij) = H_ij of the compact squares K_ij thereby obtained. If a section φ_ij
of X over each H_ij is chosen, for the moment arbitrarily, setting σ′_ij = φ_ij ∘ σ
in K_ij gives


    p ∘ σ′_ij = σ  in K_ij.


If some squares K_ij (at most four) have a common point x, taking n
sufficiently large, they may be assumed to be all contained in the open disc
centered at x and of radius r; as these K_ij are connected, and hence so are
the H_ij and their union, if the sections φ_ij are taken to be equal at σ(x),
then they are the restrictions to these H_ij of the unique section over their
union, whose value at σ(x) is the same as theirs.


Fig. 3.2. 


To choose the φ_ij so that the maps σ′_ij are the restrictions to K_ij of the
map σ′ sought, order the K_ij into a simple sequence K_i (1 ≤ i ≤ n²) as
indicated in the above figure. In H₁ = σ(K₁), choose the section φ₁ equal to
α = μ₀(0) at σ(0, 0) and define


    σ′₁ = φ₁ ∘ σ  in K₁.


As t ↦ σ′₁(0, t) is a lifting to the interval [0, 1/n] of γ₀ whose initial point is
the same as that of μ₀,


(3.4)    σ′₁(0, t) = μ₀(t)  for 0 ≤ t ≤ 1/n.
As σ(s, 0) = a for all s, similarly
(3.5)    σ′₁(s, 0) = α  for 0 ≤ s ≤ 1/n.


Having done this, in H₂ = σ(K₂), choose the section φ₂ equal to φ₁ over the
image of the common side of the two squares K₁ and K₂, then the section φ₃
over H₃ equal to φ₂ over the image of the common side of the two squares
K₂ and K₃, and so on, and define σ′_i = φ_i ∘ σ in K_i.

Suppose that the existence of a map σ′ in K₁ ∪ … ∪ K_i coinciding with
σ′_j in each K_j, j ≤ i, has been proved. The definition of σ′ will extend to
K₁ ∪ … ∪ K_{i+1} if and only if σ′_{i+1} = σ′_j in K_{i+1} ∩ K_j, where j ≤ i, when
this intersection is non-empty. Let us do it for i = 10 and j = 5 (figure!).
By construction, the partial liftings σ′₁₀ and σ′₉ are equal in K₁₀ ∩ K₉, but
the induction hypothesis shows that σ′₉ and σ′₅ are equal in the set K₉ ∩ K₅
reduced to a point and hence non-empty. As this point also belongs to K₁₀,
σ′₁₀ and σ′₅ are equal at this point, and so in K₁₀ ∩ K₅ as expected.

This argument shows that there is a continuous map σ′ : I × I → X such
that p ∘ σ′ = σ. As it satisfies σ′(0, 0) = α, this proves the theorem.


An immediate corollary is that all coverings of I are trivial, as
they have global sections: take B = I and lift the identity map from
I to B. The same result holds for Iⁿ, where n ∈ N.

More importantly:


Corollary 1. Let (X, B, p) be a covering and suppose that X is connected
and simply connected.¹¹ Let γ₀ and γ₁ be two paths in B with the same
endpoints and let μ₀ and μ₁ be liftings to X of γ₀ and γ₁ having the same
initial point. γ₀ and γ₁ are homotopic if and only if μ₀ and μ₁ have the same
terminal point.


If the condition holds, the two liftings are homotopic, since X is simply
connected, and thus so are the given paths in B. The converse,
which makes no assumptions on X, is the second statement of theorem 2.

In particular, a closed path γ in B is homotopic to a point if and only if
some (and hence all) lifting of γ to X is closed, of course provided that X is
simply connected.

To state the next corollary, which solves an essential problem in Cauchy
theory, take B = C* and consider the path


    u : t ↦ r·exp(2πit)


in B, i.e. the circle centered at 0 and of radius r > 0 traveled once
counterclockwise; let nu be the path t ↦ r·exp(2πint), i.e. the circle centered
at 0 traveled n times counterclockwise if n > 0, or −n times clockwise if n < 0;
obviously the choice of r has no impact on the homotopy class of nu in C*,
the latter being all that matters to us in what follows. Hence r = 1 may be
assumed.


Corollary 2. For any closed path γ in C*, there is a unique integer n such
that γ is homotopic to nu : t ↦ e(nt).


¹¹ i.e. having the following property: two arbitrary paths with the same endpoints
are always fixed-endpoint homotopic.

Indeed, the map e : C → C* transforms C into a connected and simply
connected covering space of C* [(i), Example 1]. A lifting of γ is a path μ
in C such that e[μ(t)] = γ(t) for all t, in other words, is a uniform branch
of Log z along γ in the sense of Chap. IV, § 4, up to the factor 2πi. As γ is
closed, μ(1) = μ(0) + n for some n ∈ Z, so that μ is fixed-endpoint homotopic
to the rectilinear path


    t ↦ (1 − t)μ(0) + tμ(1) = (1 − t)α + tβ,
where α = μ(0) and β = μ(1) = α + n. As a result, γ is fixed-endpoint
homotopic to the path
    t ↦ e[(1 − t)α + tβ] = γ(0)e(nt),


i.e. to nu. The end of the proof can be left to the reader.

The corollary shows that γ is homotopic to a point in C* if and only if
the variation of Log z or, equivalently, of Arg z over γ is zero. If the variation
of the argument is 2πn, then γ is homotopic to a circle centered at 0 traveled
n times.
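
The integer n of corollary 2 can be computed by following the proof: lift the closed path through e and read off μ(1) − μ(0). The Python sketch below is a discretized illustration under stated assumptions, not a rigorous algorithm: the names lift and winding_number and the sampling parameter are hypothetical, and the path is assumed to be sampled finely enough for the argument of γ to vary by less than half a turn between two consecutive samples.

```python
import cmath
import math

def e(zeta):
    return cmath.exp(2j * cmath.pi * zeta)

def lift(path, alpha0, samples=2000):
    """Discretized lifting through e of t -> path(t) in C*, starting at alpha0.
    At each step the point of the fiber nearest to the previous lifted point is kept."""
    mu = [alpha0]
    for j in range(1, samples + 1):
        z = path(j / samples)
        cand = cmath.log(z) / (2j * cmath.pi)     # one point of the fiber over z
        cand += round((mu[-1] - cand).real)       # integer shift keeping the lifting continuous
        mu.append(cand)
    return mu

def winding_number(path, samples=2000):
    """The integer n = mu(1) - mu(0) for a closed path in C*."""
    alpha0 = cmath.log(path(0.0)) / (2j * cmath.pi)
    mu = lift(path, alpha0, samples)
    return round((mu[-1] - mu[0]).real)

# Sample closed paths (assumed for illustration).
n_spiral = winding_number(lambda t: (2 + math.cos(4 * math.pi * t)) * e(3 * t))
n_clockwise = winding_number(lambda t: e(-2 * t))
n_far_circle = winding_number(lambda t: 2 + e(t))   # circle not surrounding 0
```

The third path is a circle of radius 1 centered at 2; its lifting is closed, in accordance with the fact that it is homotopic to a point in C*.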

In the previous statement, C* could be replaced by a pointed open disc
centered at 0, for example 0 < |z| < 1; it suffices to argue in the Poincaré
half-plane rather than in C.

Exercise 2. Extract from the preceding arguments a direct elementary 
proof of corollary 2. 


(iv) Coverings of a simply connected space. We are now ready to prove
one of the main results of the theory:


Theorem 3. Every covering (X,B,p) of a simply connected and locally 
connected space is trivial. 


It all amounts to showing the existence of a global section with given value
α ∈ X at a given point a ∈ B. To define it at an arbitrary z ∈ B, connect a
to z by a path γ and consider the lifting μ of γ with initial point α. If γ is
replaced by another path γ′ connecting a to z, the terminal point μ(1) does
not change because, B being simply connected, γ and γ′ are fixed-endpoint
homotopic, and hence so are their liftings. A map f : B → X can, therefore,
be defined without any ambiguity by setting f(z) = μ(1). So p[f(z)] = z for
all z ∈ B. For good reasons, this is very much like the construction of a
primitive of a holomorphic function on a simply connected open set.

To show that f is a section, it suffices to prove that it is continuous at
every z ∈ B. But let D be a connected open neighbourhood of z over which X
is trivial. To calculate f(z′) for z′ ∈ D, choose once and for all a path γ
connecting a to z. Let it be followed by a path γ′ connecting z to z′ in D. To
lift the new path to X, lift γ onto a path μ with initial point α and terminal
point f(z) by definition, then lift γ′ onto a path with initial point f(z); its
terminal point is f(z′). But f(z) belongs to one of the connected components
D_i of p⁻¹(D), hence so does the lifting of γ′ since its image is a connected
subset of p⁻¹(D). As a result, f(z′) is the point of D_i projecting onto z′, qed.

In the next statement, B will be said to be locally simply connected if all
neighbourhoods of any z ∈ B contain a simply connected open neighbourhood
of z. A trivial but fundamental example: any C⁰ manifold since, in a Cartesian
space, a ball is simply connected and even contractible.


Theorem 4. Every connected and locally simply connected space B has a
connected and simply connected covering (X, B, p). It is unique up to
isomorphism. If (Y, B, q) is a connected covering of B and if α ∈ X and
β ∈ Y are such that q(β) = p(α), then there is a unique continuous map f
from X onto Y such that


    p = q ∘ f,    f(α) = β,
and (X, Y, f) is a covering of Y.


The proof consists of several stages. We will indicate their outline and 
leave to the reader the task of completing them. 

(a) Choose a point a ∈ B and consider the set of all paths γ : I → B with
initial point a. Two homotopic (understood to be fixed-endpoint homotopic)
paths are considered to be equivalent. Let X be the set of classes of these
paths. Define p : X → B by associating to each path its terminal point. The
class of the constant path t ↦ a will be denoted by α.

(b) Let D ⊂ B be a simply connected open subset, z a point of D and
ζ ∈ X the class of a path γ connecting a to z. Denote by D(ζ) the set of
classes ζ′ of paths γ′ = γ.δ obtained by adjoining to γ a path δ with initial
point z in D; ζ′ only depends on the terminal point of δ since D is simply
connected. From this, the map p from D(ζ) to D is deduced to be bijective.

The set D(ζ) remains invariant if z is replaced by some z′ ∈ D and γ by
γ′ = γ.δ, where δ connects z to z′ in D; because if we connect z and z′ to a
z″ ∈ D by the paths ε and ε′ in D, the paths γ.ε and γ.δ.ε′ connecting a to
z″ are homotopic. Hence


(3.6)    ζ′ ∈ D(ζ)  ⟹  D(ζ) = D(ζ′).


To simplify the language, we say that a set of the form D(ζ) is, so to
speak, a “disc” in X. Let D(ζ) and D′(ζ′) be two “discs” in X, γ and γ′
paths with initial point a and terminal points z and z′ in B with homotopy
classes ζ and ζ′, ζ″ a point of D(ζ) ∩ D′(ζ′) corresponding to a path γ″ in B
with terminal point z″ ∈ D ∩ D′, and D″ ⊂ D ∩ D′ a simply connected
neighbourhood of z″. By definition, there exist paths δ and δ′ with initial
points z and z′ in D and D′ and terminal point z″ such that γ″ is homotopic
to both γ.δ and γ′.δ′. Then


(3.7)    D″(ζ″) ⊂ D(ζ) ∩ D′(ζ′).

Fig. 3.3.


The above figure shows the constructions needed to get the result.

(c) For given D and z ∈ D, the D(ζ) with p(ζ) = z, whose union is
p⁻¹(D), are pairwise disjoint. Indeed, if ζ and ζ′ are the classes of two
paths γ and γ′ with terminal point z and if ζ″ ∈ D(ζ) ∩ D(ζ′), there is a
path δ in D with initial point z and terminal point p(ζ″) such that γ.δ and
γ′.δ are homotopic; then so are γ and γ′ as well (exercise!). Hence ζ = ζ′.

(d) Let us say that a set U ⊂ X is open if, for all ζ ∈ U, D(ζ) ⊂ U for
sufficiently small D. Verifying the axioms (unions and intersections) is easy
(use (7) for intersections), and (6) shows that any “disc” is open in X. As the
open subsets D are arbitrarily small, every open subset of X is the union of
the D(ζ) contained in it.

To check that the space X is separated, it suffices to show that relation
D(ζ) ∩ D′(ζ′) = ∅ holds for suitable D and D′ if ζ ≠ ζ′. This is clear if
p(ζ) ≠ p(ζ′): choose D and D′ such that D ∩ D′ = ∅. If p(ζ) = p(ζ′), take
D = D′; the result then follows from (c).

(e) The map p : D(ζ) → D is a homeomorphism. Continuity follows
from the fact that, if z′ ∈ D′ ⊂ D, then the set


    p⁻¹(D′) = ⋃_{p(ζ′) = z′} D′(ζ′)


is open in X. Since, on the other hand, p maps every “disc” D(ζ) onto some
open set D, p maps every open subset of X, and in particular of D(ζ), onto
some open subset of B. As p : D(ζ) → D is bijective, it is a homeomorphism.
The fact that (X, B, p) is a covering of B is now clear.

(f) X is connected. Indeed, denoting by α the class of the constant path
t ↦ a, if γ connects a to z ∈ B, the classes of the paths


    γ_s : t ↦ γ(st),    s ∈ I,


define a path in X connecting α to the class ζ of γ.

To show that X is simply connected, observe that if there are two paths
μ and μ′ in X with initial point α and the same terminal point ζ, then these
are the liftings of two paths γ and γ′ with initial point a and terminal point
p(ζ); however, the endpoints of μ and μ′ in X are, by definition, the homotopy
classes of γ and γ′; γ and γ′ are, therefore, homotopic, hence so are μ and μ′
as well (theorem 2). Taking into account the connectedness of X, we move
on from here to paths having an arbitrary initial point.

(g) Let (Y, B, q) be a connected covering of B and let α ∈ X, β ∈ Y be
such that q(β) = p(α). For all ζ ∈ X, there exists a path μ connecting α
to ζ. It is unique up to homotopy since X is simply connected. If γ is the
projection of μ onto B, then there is a unique lifting ν of γ to Y having
β as initial point. As γ is unique up to homotopy, the terminal point f(ζ) of
ν only depends on ζ. This gives the map f sought. Its uniqueness is due to
the fact that this is obviously the only way it could possibly be defined. That
(X, Y, f) is a covering is more or less obvious.

If Y is simply connected, there is, conversely, a map g : Y → X such
that p ∘ g = q, g(β) = α. Clearly, f and g are mutually inverse. This gives
the isomorphism of the simply connected coverings considered, qed.

If B is a connected and locally arc-connected space, the connected and
simply connected covering (X, B, p) of theorem 4 is the universal covering
of B; by part (g) of the proof, it “dominates” all the others. Example 1
above shows that, due to the exponential map, C is a universal covering space
of C*, while, due to the map t ↦ e(t), R is the universal covering space of T.

Exercise 3. Let G ⊂ C be a domain and (G̃, G, p) its universal covering.
Show that there is a complex analytic structure on G̃ such that p is
a holomorphic submersion. Let f be a holomorphic function on G and ω
the inverse image under p of the differential form f(z)dz. Show that there
is a holomorphic function F on G̃ such that dF = ω. What about the case
G = C*?


(v) Coverings of a pointed disc. Let D* be a pointed disc, i.e. either an
open disc in C with its centre removed, or else the exterior of a closed disc
in C (pointed disc “centered at ∞”). Using a conformal representation, we
need only consider the disc 0 < |z| < 1. Like all coverings of a Riemann
surface, a covering (Y, D*, q) of D* has a natural Riemann surface structure:
the analytic structure of the discs’ projections onto D* is transferred to the
discs and the compatibility clause is immediate. The following result will play
an essential role in the construction of the compact Riemann surface of an
algebraic function:


Theorem 5. Let (Y, D*, q) be a connected covering of order k < +∞ of
the pointed disc D* : 0 < |z| < 1. Then there is a holomorphic function
φ on Y such that (i) ζ ↦ φ(ζ) is a conformal representation of Y on D*,
(ii) φ(ζ)^k = q(ζ).


This means that (Y, D*, q) is isomorphic to the covering of D* obtained
by taking Y = D* and q(z) = z^k [(i), Example 2], or to the covering obtained
by constructing the Riemann surface of the algebraic function z^{1/k} using the
method of n° 2, or finally that the algebraic function z^{1/k}, which obviously
does not have uniform branches on the pointed disc, becomes uniform on Y:
a k-th root of z = q(ζ) can be associated to each ζ ∈ Y so that it depends
holomorphically on ζ (but obviously not on z).

(a) Choose an arbitrary point $a$ of $D^*$ and some $\beta \in Y$ such that $q(\beta) = a$
and associate to every closed path $\gamma$ with initial point $a$ in $D^*$ the terminal
point of its lifting $\nu$ with initial point $\beta$ in $Y$. As the terminal point of $\nu$ only
depends on the homotopy class of $\gamma$, by theorem 2 and its corollary 2, only
the paths $\gamma_n : t \mapsto a\,e(nt)$ need to be considered; let $\nu_n$ be the lifting of $\gamma_n$.
The path $\gamma_m.\gamma_n^{-1}$ consisting of $\gamma_m$ followed by the opposite of $\gamma_n$ is clearly
the path $\gamma_{m-n}$, up to parameterisation. If $\nu_m$ and $\nu_n$ have the same terminal
point, the lifting of $\gamma_{m-n}$ is obviously the closed path $\nu_m.\nu_n^{-1}$. If, conversely,
the lifting $\nu_{m-n}$ of $\gamma_{m-n}$ is closed, the path $\nu_{m-n}.\nu_n$ with initial point $\beta$ is
a lifting of $\gamma_{m-n}.\gamma_n$, a path identical to $\gamma_m$ up to parameterisation. Then
$\nu_m = \nu_{m-n}.\nu_n$ up to parameterisation, so that the paths $\nu_m$ and $\nu_n$ have the
same terminal point.

This leads to two conclusions. First, if $\nu_m$ and $\nu_n$ are closed, so is $\nu_{m-n}$;
the set of $n$ such that $\gamma_n$ is lifted onto a closed path is, therefore, a subgroup
$p\mathbf{Z}$ of $\mathbf{Z}$. Secondly, the terminal point of $\nu_n$ only depends on the class of
$n$ mod $p$. As any point of $Y$ projecting onto $a$ is the terminal point of such a
lifting, we conclude that there are as many classes mod $p$ as points of $Y$ over
$a$, i.e. that $p = k$.
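
The argument of part (a) can be checked numerically on the model covering $q(w) = w^k$ of the pointed unit disc. The sketch below is only an illustration under that assumption: the names `lift_endpoint` and `lift_is_closed`, the base point $a = 0.5$ and the step count are hypothetical choices, not part of the text. The loop $\gamma_n(t) = a\,e(nt)$ is lifted step by step by always selecting the $k$-th root closest to the previous point, and the lift turns out to be closed exactly when $n$ is a multiple of $k$.

```python
import cmath

def e(t):
    """The map t -> exp(2*pi*i*t) used throughout the text."""
    return cmath.exp(2j * cmath.pi * t)

def lift_endpoint(k, n, a=0.5, steps=2000):
    """Lift the loop gamma_n(t) = a*e(n*t) through q(w) = w**k.

    The lift starts at the positive real k-th root beta of a and is
    continued by choosing, at each step, the k-th root of gamma_n(t)
    closest to the previous point. Returns (beta, terminal point).
    Assumes |n| is much smaller than steps, so consecutive points stay close.
    """
    beta = a ** (1.0 / k)
    w = complex(beta)
    for j in range(1, steps + 1):
        z = a * e(n * j / steps)
        r = abs(z) ** (1.0 / k)
        theta = cmath.phase(z) / k
        candidates = [r * cmath.exp(1j * (theta + 2 * cmath.pi * m / k))
                      for m in range(k)]
        w = min(candidates, key=lambda c: abs(c - w))
    return beta, w

def lift_is_closed(k, n):
    """True when the lift of gamma_n ends where it started."""
    beta, end = lift_endpoint(k, n)
    return abs(end - beta) < 1e-6

if __name__ == "__main__":
    k = 3
    for n in range(-3, 7):
        print(n, lift_is_closed(k, n))
```

In this model the terminal point of the lift depends only on $n$ mod $k$, which is the second conclusion drawn above.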

(b) Having made this point, let us consider the universal covering $(P, D^*, e)$
of $D^*$, where $P$ is the half-plane $\mathrm{Im}(\zeta) > 0$ and $e$ the map $\zeta \mapsto e(\zeta)$ from
$P$ onto $D^*$. If we choose some $\alpha \in P$ such that $e(\alpha) = a$, then theorem 4
shows the existence and uniqueness of a continuous map $f : P \to Y$ such
that $f(\alpha) = \beta$ and $q[f(\zeta)] = e(\zeta)$ for all $\zeta \in P$. Let $\zeta'$ and $\zeta''$ be two points
such that $f(\zeta') = f(\zeta'')$. So $\zeta'' = \zeta' + n$ for some $n \in \mathbf{Z}$. Setting $e(\zeta') = z'$,
$e(\zeta'') = z''$, $f(\zeta') = f(\zeta'') = \eta \in Y$, the map $f$ transforms every path $\mu$
connecting $\zeta'$ to $\zeta''$ into a closed path $\nu$ with initial point $\eta$ in $Y$, which is a
lifting of the path $\gamma$, the image of $\mu$ under $e$. Choosing as $\mu$ the rectilinear
path $[\zeta', \zeta''] = [\zeta', \zeta' + n]$, clearly $\gamma(t) = z'e(nt)$. Since $\gamma$ can be lifted to a
closed path $\nu$ in $Y$, $n \equiv 0$ mod $k$ by part (a) of the proof, which can be applied
to all points of $D^*$ and in particular to $z'$. Conversely, if $n$ is a multiple of
$k$, $f$ transforms $[\zeta', \zeta' + n]$ into a necessarily closed lifting of $t \mapsto z'e(nt)$. So
$f(\zeta') = f(\zeta'')$.

It follows that $f$ is a homeomorphism of the quotient $P/k\mathbf{Z}$ onto $Y$, which
is obviously compatible with the complex analytic structures of both spaces
and with the projections $q : Y \to D^*$ and $p_k : P/k\mathbf{Z} \to D^*$, so that the
given covering $(Y, D^*, q)$ is isomorphic, including from a complex analytic
point of view, to the canonical covering $(P/k\mathbf{Z}, D^*, p_k)$ of example 2 of this n°.
Hence it remains to construct the function $\varphi$ of the theorem for this particular
covering. However, the map $z \mapsto e(z/k)$ from $P$ onto $D^*$ is invariant under
$k\mathbf{Z}$; so there is one and only one map $\varphi : P/k\mathbf{Z} \to D^*$ which, for all $z \in P$,
transforms the class of $z$ mod $k\mathbf{Z}$ into $e(z/k)$. The reader is left to check that
$\varphi$ is holomorphic and satisfies conditions (i) and (ii) of the theorem.

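The theorem applies to every pointed disc centered at some $a \in \hat{\mathbf{C}}$; replace the
relation $\varphi(\eta)^k = q(\eta)$ by

$\varphi(\eta)^k = q(\eta) - a$ if $a \neq \infty$, $\varphi(\eta)^k = 1/q(\eta)$ if $a = \infty$.

It also applies to coverings of $\mathbf{C}^*$: replace $P$ by $\mathbf{C}$ in the preceding
arguments.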


4 — The Riemann Surface of an Algebraic Function 
(i) Global uniform branches. Once again, consider an irreducible equation

(4.1) $P(z,\zeta) = P_0(z)\zeta^n + \ldots + P_n(z) = 0$


of degree $n$ in $\zeta$ and, as in n° 2, remove from $\mathbf{C}$ the finite set of the values of
$z$ where (1) does not have $n$ distinct roots. This gives an open subset $B$. As
shown, the subset $X \subset \mathbf{C}^2$ of the curve $P = 0$ which projects onto $B$ is both
a Riemann surface and a covering of order $n$ of $B$. If $(a, \alpha) \in X$ and if $D$
is a sufficiently small open disc centered at $a$, there is a unique holomorphic
function $f(z)$ in $D$ satisfying $P[z, f(z)] = 0$ and $f(a) = \alpha$, and the couple
$(f(D), p)$ is a local holomorphic chart of $X$ in the neighbourhood of $(a, \alpha)$,
defined in the “disc” $D(\alpha) = f(D)$. As $(z,\zeta) \in f(D)$ means that $\zeta = f(z)$,
the functions $p : (z,\zeta) \mapsto z$ and $F : (z,\zeta) \mapsto \zeta$ are holomorphic in this chart,
and hence in $X$ in the sense of n° 1. More generally, if $f$ and $g$ are polynomials,
the functions $(z,\zeta) \mapsto f(z,\zeta)$ and $(z,\zeta) \mapsto g(z,\zeta)$ are holomorphic on $X$, so
that $(z,\zeta) \mapsto f(z,\zeta)/g(z,\zeta)$ is meromorphic if $g$ does not identically vanish
on $X$ (i.e. is not a multiple of the polynomial $P$).

A first immediate consequence of these results concerns global uniform 
branches of an algebraic function. Their existence is governed by the follow- 
ing result, where we keep the above notation: 


Theorem 6. Let $U$ be a simply connected open set contained in $B$. There are $n$
holomorphic functions $f_k$ in $U$ whose values at each $z \in U$ are the $n$ roots of
the equation $P(z,\zeta) = 0$.

As $X$ is a covering space of $B$, $p^{-1}(U)$ is a covering space of $U$, hence
is trivial (theorem 3). So it has $n$ sections $z \mapsto (z, f_k(z))$, with functions $f_k$
defined on $U$ and locally, hence globally, holomorphic, qed.

Our task is now going to be to complete $X$ to obtain a compact Riemann
surface $\hat X$ for which the map $p : X \to B = \mathbf{C} - S$ can be extended
to a map $\hat X \to \hat{\mathbf{C}}$ which will be holomorphic without however satisfying
axiom (R) of coverings in the neighbourhoods of the points $z \in \hat{\mathbf{C}} - B$. For
this, finitely many points need to be adjoined to $X$ “over” the $z \in \hat{\mathbf{C}} - B$
and the holomorphic charts in the neighbourhood of these points need to be
defined.¹²

This supposes that the structure of $X$ over a neighbourhood of any $a \in \hat{\mathbf{C}} - B$
is known; theorem 5 provides the answer. It will nonetheless be somewhat
long, but there is no quick method, especially if we want to explain everything.


(ii) Definition of the Riemann surface $\hat X$. Let us return to the Riemann
surface $(X, B, p)$ of the equation $P(z,\zeta) = 0$ over the open subset $B$ of
$\mathbf{C}$. For any given $a \in \hat{\mathbf{C}} - B$, choose an open disc $D(a)$ centered at $a$ not
containing any other point of $\hat{\mathbf{C}} - B$ apart from $a$, and let $Y$ be a connected
component of $p^{-1}(D^*(a))$; it is a connected covering space of order $k \le n$
of $D^*(a)$. Let $\varphi_Y$ be a holomorphic function on $Y$ satisfying the properties of
theorem 5 with respect to $D^*(a)$; since

(4.2) $\varphi_Y(z,\zeta)^k = z - a$ resp. $1/z$

for all $\eta = (z,\zeta) \in Y$, $\varphi_Y(\eta)$ tends to 0 as $p(\eta) = z$ tends to $a$. An “ideal
point” (so Forster says) then needs to be added to $Y$ as is done to go from $\mathbf{C}$
to $\hat{\mathbf{C}}$. Denote this point by $\eta_Y$, set $\hat Y = Y \cup \{\eta_Y\}$ and assume that $\varphi_Y(\eta_Y) = 0$.
This gives a bijection from $\hat Y$ onto the open disc centered at 0 in $\mathbf{C}$; thus the
holomorphic structure of this disc can be transferred to $\hat Y$. The connected
component $Y = \hat Y - \{\eta_Y\}$ becomes an open subset of $\hat Y$ and the holomorphic
functions on an open subset $U$ of $\hat Y$ are those that can be expressed
holomorphically by using $\varphi_Y$: equip $\hat Y$ with the holomorphic structure for which
$(\hat Y, \varphi_Y)$ is a holomorphic chart of $\hat Y$ and $\varphi_Y$ is a local uniformizer at $\eta_Y$.

We show that the holomorphic functions on every open subset $U$ of $Y$,
that we are already acquainted with from the end of n° 2 (an open subset
of $Y$ is also open in $X$), are just the holomorphic functions of $\varphi_Y$. First of
all, $U$ is the union of “discs” of $X$. The notion of holomorphy being a local
one, we can confine ourselves to the case of a disc $U$, hence suppose that
$U = D'(\beta)$ for some $(b, \beta) \in Y$, where $D' \subset D^*(a) \subset B$ is a sufficiently
small open disc centered at $b \neq a$ over which the covering $X$, and hence
$Y$, is trivial. $D'(\beta)$ is the image of $D'$ under $z \mapsto (z, f(z))$, where $f$ is the


¹² The following developments, and even some of the preceding ones, are strongly
influenced by Chap. I of Otto Forster’s Lectures on Riemann Surfaces (Springer,
1981). Let us also mention Hershel M. Farkas and Irwin Kra, Riemann Surfaces
(Springer, 1980).
uniform branch in $D'$ of the algebraic function $P(z,\zeta) = 0$ which takes the
value $\beta$ at $b$. If $U$ is regarded as an open subset of $X$, the end of n° 2
tells us that the holomorphic functions of $(z,\zeta)$ on $U$ are the holomorphic
functions of $z$ on $D'$. So it amounts to showing that the holomorphic functions
of $\varphi_Y(z,\zeta)$ on $D'(\beta)$ are identical to the holomorphic functions of $z$. However,
$\varphi_Y(z,\zeta) = \varphi_Y(z, f(z))$ in $D'(\beta)$. As $\varphi_Y$ is holomorphic on $Y$ in the sense of
n° 2 (theorem 5) and as $z \mapsto (z, f(z))$ is a holomorphic map of $z$ from $D'$
to $X$, the left hand side is a holomorphic function of $z$ on $D'$. Conversely, the
relation $\varphi_Y(z, f(z))^k = z - a$ resp. $1/z$ shows that $z$ is a holomorphic function
of $\varphi_Y(z, f(z))$ on $D'(\beta)$, qed.

The complete Riemann surface $\hat X$ sought is then obtained as follows: for
each $a \in \hat{\mathbf{C}} - B$, the points $\eta_Y$ corresponding to the connected components
of $X$ over $D^*(a)$ are adjoined to $X$. These $\eta_Y$ are considered to be pairwise
distinct. If the covering space $Y$ of $D^*(a)$ is of order $k$, $(\hat X, \hat{\mathbf{C}}, p)$, which is a
covering only over $B$ and perhaps over a neighbourhood of $\infty$, is said to be
a branched covering of $\hat{\mathbf{C}}$ at the branch point $\eta_Y$ of order $k$. This is not a
property of the Riemann surface $\hat X$, which is as smooth as possible a manifold;
it is a property of the map $p : \hat X \to \hat{\mathbf{C}}$.

For example, if we take the algebraic equation $\zeta^2 - z = 0$, then $X$ is the
set of $(z,\zeta) \in \mathbf{C}^2$ such that $\zeta^2 = z$, $z \neq 0$. Over a disc $D$ which does not
contain 0, the surface has two disjoint connected components corresponding
to the two uniform branches on $D$ of the pseudo-function $z^{1/2}$. This is no
longer the case over a disc $D$ centered at 0, since, when $z$ circles once around
the point 0, the determination chosen at the start for $z^{1/2}$ becomes the opposite
determination at the end; this means that two points of the surface
projecting onto $z$ can be connected by a curve, as in the case of the logarithm
(Chap. IV, § 4). The surface is, therefore, connected over $D^* = D - \{0\}$;
this is also the case over $D$ since as $z$ tends to 0, the two possible values
of $z^{1/2}$ also tend to 0, and only one point, namely $(0,0)$, remains over the
origin. There is a “branch point” in this case only because we are trying to
express $\zeta$ by using $z$ in the neighbourhood of 0; if we tried to express $z$ as a
function of $\zeta$, all would become normal again. But, even in the real domain,
a general algebraic curve can have singularities (multiple points, cusp points,
etc.) far more complicated than a vertical tangent.

In the general case, $\hat X$ being the union of $X$ and of the $\hat Y$, we define the
topology of $\hat X$ by declaring $U \subset \hat X$ to be open if $U \cap X$ is open in $X$ and if, for
all $\hat Y$ such that $\eta_Y \in U$, the set $U \cap \hat Y$ is open in $\hat Y$. The Hausdorff axiom and
the fact that $X$ and the $\hat Y$ are open in $\hat X$ are immediately verified. Finally, the
complex analytic structure of $\hat X$ is obtained by adjoining the charts $(\hat Y, \varphi_Y)$
to the charts already available in $X$. As shown, these are compatible with
the complex analytic structure of the open subsets $Y$, hence of $X$, defined in
n° 2; they are also compatible with each other as they are pairwise disjoint.
This provides $\hat X$ with the structure of a Riemann surface, which coincides
with that of n° 2 in the open set $X$.

The map $p$ from $\hat X$ onto $\hat{\mathbf{C}}$ is obtained by setting $p(\eta_Y)$, for each open
set $\hat Y$, to be the point $a \in \hat{\mathbf{C}}$ from which $Y$ was obtained. If $D(a)$ is the disc
centered at $a$ chosen to get the connected components $Y$ over a neighbourhood
of $a$, the inverse image $p^{-1}(D(a))$ is the union of open sets $\hat Y$, and so is
open in $\hat X$; obviously so is $p^{-1}(D')$ for any disc $D' \subset D(a)$. As a result,
$p$ is continuous and, even better, maps every open subset of $\hat X$ onto an open
subset of $\hat{\mathbf{C}}$. But $\hat X$ is not, strictly speaking, a covering of $\hat{\mathbf{C}}$.

Let us show that $\hat X$ is compact. For each $a \in \hat{\mathbf{C}} - B$, consider the open
subset $p^{-1}(D(a))$ of $\hat X$. It is the union of open subsets of type $\hat Y$. Let $D_a \subset D(a)$
be a closed disc, hence compact in $\hat{\mathbf{C}}$, centered at $a$; for all $\hat Y \subset p^{-1}(D(a))$,
the chart $(\hat Y, \varphi_Y)$ of $\hat X$ transforms $p^{-1}(D_a) \cap \hat Y$ homeomorphically into a
closed, hence compact, disc centered at 0. As a result, $p^{-1}(D_a)$ is a finite
union of compact sets, and so is compact. Similarly, for all $a \in B$, there is a
closed disc $D_a$ centered at $a$ such that $p^{-1}(D_a)$ is compact. Since the interiors
of the $D_a$ cover the compact space $\hat{\mathbf{C}}$, $\hat{\mathbf{C}}$ can be covered by finitely many discs
$D_a$. Thus $\hat X$ is the union of finitely many compact sets, qed.


(iii) The algebraic function $F(z)$ as a meromorphic function on $\hat X$. Let
us now show that the functions $(z,\zeta) \mapsto z$ and $F : (z,\zeta) \mapsto \zeta$, defined
and holomorphic on $X$, have meromorphic extensions on $\hat X$. It is enough to
prove this on each open set $\hat Y$, using the chart $(\hat Y, \varphi_Y)$ to verify it. If $Y$
corresponds to a point $a \in \hat{\mathbf{C}} - B$ and if $Y$ is of order $k$, then $\varphi_Y(z,\zeta)^k = z - a$
or $1/z$, whence the result related to $(z,\zeta) \mapsto z$; in particular, $(z,\zeta) \mapsto z$ has
a pole of order $k$ at $\eta_Y$ if this point projects onto the point $\infty$ of $\hat{\mathbf{C}}$. As an
aside, note that the point $\infty$ having been excluded during the construction
of $X$, $\hat X$ may very well be a genuine covering space of a neighbourhood of
$\infty$, in other words that $k = 1$ for each $\eta_Y$ projecting onto $\infty$; the Riemann
surface of $\zeta(\zeta - 1)z = 1$ has two points over $\infty$, the function $\zeta$ taking values
0 and 1 at these points.

The case of the algebraic function $F : (z,\zeta) \mapsto \zeta$ is less obvious. We will
suppose that $\eta_Y$ projects onto a point $a \neq \infty$, the other case being similarly
dealt with. As $(\hat Y, \varphi_Y)$ is a holomorphic chart of $\hat Y$ and as $F$ is holomorphic
on $Y = \hat Y - \{\eta_Y\}$, there is a Laurent series expansion

(4.3) $F(z,\zeta) = \zeta = \sum_{n \in \mathbf{Z}} c_n q^n = h(q), \qquad q = \varphi_Y(z,\zeta)$

in $Y$, and it all amounts to showing that $c_n = 0$ for all $n < 0$ with $|n|$
sufficiently large. Since $P(z,\zeta) = 0$ and $q^k = z - a$,

$P_0(q^k + a)\,h(q)^n + \ldots + P_n(q^k + a) = 0$

and so

(4.4) $h(q)^n + s_1(q)h(q)^{n-1} + \ldots + s_n(q) = 0$ for $q \neq 0$.

The rational functions

$s_i(q) = P_i(q^k + a)/P_0(q^k + a)$

are meromorphic at the origin; these are elementary symmetric functions
of the roots of (4), up to sign. To show that $h(q)$ has at most a pole at
$q = 0$, it suffices to show that there is an upper bound of the form $h(q) =
O(q^{-N})$. Now, the coefficients $s_i(q)$ are of this form. Therefore, the following
result remains to be proved. It probably dates back to time immemorial and
can be found, for example, as an exercise in Dieudonné’s Calcul infinitésimal,
Chap. III:


Lemma 1. Let $\zeta_i$ $(1 \le i \le n)$ be the roots of an equation

(4.5) $\zeta^n + c_1\zeta^{n-1} + \ldots + c_n = 0$

with complex coefficients. Then

(4.6) $\sup |\zeta_i| \le \max(1, |c_1| + \ldots + |c_n|)$.

Let $M$ be the left hand side of (6). Each root of (5) satisfies

$|\zeta_i|^n \le |c_1|\,|\zeta_i|^{n-1} + \ldots + |c_n| \le |c_1|\,M^{n-1} + \ldots + |c_n|$

whence

$M^n \le |c_1|\,M^{n-1} + \ldots + |c_n|$.

If $M \le 1$, there is nothing to prove. If $M > 1$, we then write

$M \le |c_1| + \ldots + |c_n|/M^{n-1} \le |c_1| + \ldots + |c_n|$,

qed.
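
As a quick sanity check of inequality (6), one may build a monic polynomial from chosen roots and compare the largest modulus with the bound. The sketch below is only illustrative; the helper names `monic_coeffs_from_roots` and `root_bound` are hypothetical and the sample roots are arbitrary.

```python
def monic_coeffs_from_roots(roots):
    """Return [c_1, ..., c_n] with prod (x - r) = x^n + c_1 x^(n-1) + ... + c_n."""
    coeffs = [1 + 0j]                 # coefficients, highest degree first
    for r in roots:
        new = coeffs + [0j]           # multiply the current polynomial by x ...
        for i in range(len(coeffs)):
            new[i + 1] -= r * coeffs[i]   # ... and subtract r times it
        coeffs = new
    return coeffs[1:]

def root_bound(cs):
    """Right hand side of (4.6): max(1, |c_1| + ... + |c_n|)."""
    return max(1.0, sum(abs(c) for c in cs))

if __name__ == "__main__":
    roots = [2, -3, 1j]
    cs = monic_coeffs_from_roots(roots)
    print(max(abs(r) for r in roots), "<=", root_bound(cs))
```

The bound is far from sharp in general, but the argument above only needs it to turn bounds on the coefficients $s_i(q)$ into a bound on $h(q)$.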


Note that if $P_0(a) \neq 0$, the coefficients $s_i(q)$ of (4) are holomorphic at
$q = 0$, and so are bounded for sufficiently small $q$, hence so is $h(q)$. As
a result, the function $F(z,\zeta) = \zeta$ is holomorphic at the point $\eta_Y \in \hat X$ if
$P_0(a) \neq 0$, in other words if the equation $P(a, Y) = 0$ is effectively of degree
$n$ at the point $a$. Since $P(z,\zeta) = 0$ in $Y$, the value of $\zeta$ at the point $\eta_Y$ is
obtained by passing to the limit as $(z,\zeta)$ converges to $\eta_Y$, hence is a root of
$P(a,\zeta) = 0$.

On the contrary, if $R_0(0) = P_0(a) = 0$, then $P_0(X + a) = X^m Q_0(X)$ with
$Q_0(0) \neq 0$; the coefficients

$s_i(q) = P_i(q^k + a)/q^{mk}Q_0(q^k)$

can have poles of order $\le mk$ at the origin and so are $O(q^{-mk})$. Thus so
is $h(q)$ as well. The exact order of $\zeta$ at the point $\eta_Y$ can theoretically be
calculated by the Newton polygon method,¹³ but the latter has fortunately
long disappeared from presentations on Riemann surfaces.

In all cases, the situation over the point $a$ is then easily elucidated. Let
us return to the disc $D(a)$ centered at $a$ used above and let $Y_1, \ldots, Y_m$ be
the various connected components of $X$ over $D^*(a)$. Each $Y_i$ is a connected
covering space of order $k_i$ of $D^*(a)$, and $k_1 + \ldots + k_m = n$ since there
are $n$ points of $X$ over each point of $B$. If $q_i(z,\zeta)$ denotes the function $\varphi_Y$
corresponding to $Y = Y_i$, so that $(\hat Y_i, q_i)$ is a chart of $\hat X$, then

(4.7) $q_i(z,\zeta)^{k_i} = z - a$ if $a \neq \infty$, $= 1/z$ if $a = \infty$;

$(z,\zeta) \mapsto \zeta$ is a meromorphic function $h_i(q_i)$ on $\hat Y_i$, possibly with a pole at
$q_i = 0$, hence a meromorphic Laurent series in the local uniformizer $q_i$. The
$k_i$ points of $Y_i$ located over a given $z \in D^*(a)$ correspond to the $k_i$ values of
$q_i$ satisfying (7); they can be deduced from each other by multiplying
$q_i$ by the $k_i$-th roots of unity. For $z \in D^*(a)$, equation $P(z,\zeta) = 0$ has
exactly $k_i$ distinct roots such that $(z,\zeta) \in Y_i$; they correspond to the $k_i$
distinct points of $Y_i$ located over $z$; these are the numbers obtained by replacing
$q_i$, in the series $h_i$, by the $k_i$ $k_i$-th roots of $z - a$ or of $1/z$. As $(z,\zeta) \in Y_i$ tends
to the point $\eta_Y = \eta_i$ adjoined to $Y = Y_i$, $\zeta$ tends either to the finite limit
$\zeta_i = h_i(0)$ satisfying $P(a,\zeta_i) = 0$, or to infinity. If $P_0(a) \neq 0$ and $a \neq \infty$,
we have seen that the $h_i$ are holomorphic at the origin and we deduce that,
for all $r > 0$, there exists $\rho > 0$ such that, for $|z - a| < \rho$, the equation
$P(z,\zeta) = 0$ has at least $k_i$ simple roots satisfying $|\zeta - \zeta_i| < r$; as a result, by
lemma 1 of n° 2, the order of multiplicity of the root $\zeta_i$ of $P(a,\zeta) = 0$ is at
least $k_i$.

As $\sum k_i = n$, this result suggests that the connected components $Y_i$ of
$X$ over $D^*(a)$ correspond bijectively to the various roots of the equation
$P(a,\zeta) = 0$, each multiple root $\zeta_i$ of order $k_i$ giving rise to a component $Y_i$
of order $k_i$.

False: $\zeta_i = \zeta_j$ may hold for $i \neq j$, in which case the multiplicity of the
root $\zeta_i$ is at least $k_i + k_j$.

Exercise 1. Consider the equation

(4.8) $\zeta^2 - z\zeta - z^4 = 0$.

It has double roots in $\zeta$ for $z = 0$, $i/2$ and $-i/2$. Therefore, the Riemann
surface $X$ constructed in n° 2 is the subset of the graph of relation (8) in $\mathbf{C}^2$
located over the open subset

$B = \mathbf{C} - \{0, i/2, -i/2\}$


¹³ The shortest presentation, but not necessarily the most accessible, is that of
Dieudonné in Calcul infinitésimal, Appendix to Chap. III. The fact that he only
looks for real branches and limited expansions instead of Laurent series in $q$ so
as not to traumatize his novice readers at the outset does not change anything.
Besides, the method applies to functions that are not necessarily algebraic.
of $\mathbf{C}$. Setting $\zeta = z\tau$, the equation becomes $\tau^2 - \tau = z^2$, which has two
uniform branches in the disc $D$, i.e. the disc $|z| < 1/2$. Show that these are
given by

(4.9) $g_1(z) = 1 + z^2 + \ldots$
(4.10) $g_2(z) = -z^2 + z^4 + \ldots$.

Show that $p^{-1}(D^*)$ is the union of the graphs $Y_1$ and $Y_2$ over $D^*$ of the
functions

$f_1(z) = z g_1(z) = z + z^3 + \ldots$ and $f_2(z) = z g_2(z) = -z^3 + z^5 + \ldots$

and that they are the two connected components of $p^{-1}(D^*)$ despite the fact
that the equation $P(0,\zeta) = 0$ has a single root. Show that, like $(z,\zeta) \mapsto z$ and
$(z,\zeta) \mapsto \zeta$, the meromorphic function $\tau : (z,\zeta) \mapsto \zeta/z$ on $\hat X$ takes values 1
and 0 at the two points $\eta_1$ and $\eta_2$ of $\hat X$ projecting onto $z = 0$. Can it be
attributed a value at the point $(0,0)$ by considering the graph of equation (8)?
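
Before attempting the exercise, the expansions can be tested numerically. The sketch below is not a solution: it merely solves the quadratic $\zeta^2 - z\zeta - z^4 = 0$ with the usual formula, written as $\zeta = z(1 \pm \sqrt{1 + 4z^2})/2$, and compares both roots with the truncated series. The function names are hypothetical, and the principal square root is an assumption that is harmless for $|z| < 1/2$, where $1 + 4z^2$ has positive real part.

```python
import cmath

def roots_in_zeta(z):
    """Both roots of zeta^2 - z*zeta - z^4 = 0 for a given z with |z| < 1/2."""
    disc = cmath.sqrt(1 + 4 * z * z)   # principal branch, holomorphic for |z| < 1/2
    return z * (1 + disc) / 2, z * (1 - disc) / 2

def f1_series(z):
    """First two terms of f_1(z) = z*g_1(z)."""
    return z + z ** 3

def f2_series(z):
    """First two terms of f_2(z) = z*g_2(z)."""
    return -z ** 3 + z ** 5

if __name__ == "__main__":
    z = 0.05 + 0.02j
    r1, r2 = roots_in_zeta(z)
    print(abs(r1 - f1_series(z)), abs(r2 - f2_series(z)))
    print(r1 / z, r2 / z)              # values of tau, close to 1 and 0
```

For small $z$ the quotients $\zeta/z$ on the two branches approach 1 and 0, which is the behaviour of $\tau$ at the two points over $z = 0$ that the exercise asks to establish.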


When $P_0(a) = 0$, equation $P(a,\zeta) = 0$ has fewer than $n$ roots; to get
$n$ roots, replace $\zeta$ by $1/\zeta = \zeta'$, in other words replace the initial polynomial
$P(X,Y)$ by

$Q(X,Y) = Y^n P(X, 1/Y) = P_n(X)Y^n + \ldots + P_0(X)$.

For $X = a$, 0 is a root, which can be interpreted by saying that $P(a,\zeta) = 0$
admits the root $\infty$ with an order of multiplicity $r$ equal to that of the root 0
of $Q(a,\zeta') = 0$; the latter is given by the relations

$P_0(a) = \ldots = P_{r-1}(a) = 0$, $P_r(a) \neq 0$.

As shown by these relations, the degree of the equation $P(a,\zeta) = 0$ is $n - r$;
it has $n$ roots in all, namely $n - r$ finite roots and an infinite root of order $r$.


(iv) Connectedness of $\hat X$.


Theorem 7. The Riemann surface of an irreducible algebraic equation is
connected.


It suffices to prove this for the open subset $X$ of $\hat X$ because the points $\eta_Y$
adjoined to $X$ can obviously be connected to points of $X$ by paths.

As seen in section (ii) of the previous n°, like $X$, any connected component
$X'$ of $X$ is a covering space of $B$. Let $k$ be its order, so that for any disc $D \subset B$,
the open set $p^{-1}(D) \cap X'$ is the union of “discs” $D_i$ $(1 \le i \le k)$ corresponding
under $z \mapsto (z, f_i(z))$ to uniform branches on $D$ of our algebraic “function”
$z \mapsto \zeta$. The identity

$\prod_i (T - f_i(z)) = T^k + c_1(z)T^{k-1} + \ldots + c_k(z)$,

where $T$ is an indeterminate, shows that the $f_i(z)$ are the roots of the
polynomial

(4.11) $P'(T) = T^k + c_1(z)T^{k-1} + \ldots + c_k(z) = 0$

with coefficients in the ring of holomorphic functions on $D$ (nothing to do
with the derivative of $P$). But for all $z \in D$, the $c_i(z)$ are, up to sign, the
elementary symmetric functions of the $f_i(z)$, i.e. of the roots of the equation
$P(z,\zeta) = 0$ such that $(z,\zeta) \in X'$. It follows that the $c_i(z)$ are the same for all discs
$D \subset B$ and hence are the restrictions to $D$ of holomorphic functions on $B$.
We show that these are rational and, to do this, that they only have polar
singularities at every $a \in \hat{\mathbf{C}} - B$.

As in (ii), consider the disc $D(a) = D$; over $D^*$, $X$ is the union of open
connected sets $Y$; hence the same holds for $X'$. If $Y \subset X'$ and if $q = \varphi_Y(z,\zeta)$
is the corresponding local uniformizer, $\zeta$ is a meromorphic function of $q$ on $\hat Y$
with at most a pole at $q = 0$. For $z \in D^*$ the roots $\zeta_i$ of $P(z,\zeta) = 0$ such
that $(z,\zeta_i) \in Y$ satisfy an upper bound $\zeta_i = O(q^{-N})$ as $q$ tends to 0. But if the
order of the covering space $Y$ of $D^*$ is equal to $r$, then $q^r = z - a$ or $1/z$ as
the case may be, whence $\zeta = O((z - a)^{-N})$ or $O(z^N)$ for another integer $N$.
Hence the elementary symmetric functions of the $\zeta_i = f_i(z)$ such that $(z,\zeta_i) \in X'$
satisfy similar upper bounds. Thus, being holomorphic on $B$ and
at most of polynomial growth in the neighbourhood of $\hat{\mathbf{C}} - B$, the coefficients
of (11) are indeed rational functions of $z$.

As a result, for each connected component $X'$ of $X$, (11) is an algebraic
equation with coefficients in the field $K$ of rational functions of $z$. Denoting by
$X_j$ the connected components of $X$ and by $P_j$ the corresponding polynomials,
it becomes clear that

(4.12) $\prod_j P_j(T) = \prod_\zeta (T - \zeta) = T^n + s_1(z)T^{n-1} + \ldots + s_n(z) = P(z,T)/P_0(z)$,

where the second product is extended to all roots $\zeta$ of $P(z,\zeta) = 0$ and where, in
consequence, the coefficients are the rational functions $s_i(z) = P_i(z)/P_0(z)$
already encountered in (4). Multiplied by $P_0(z)$, this identity between
polynomials in $T$ with coefficients in the field $K = \mathbf{C}(z)$ of rational
fractions in one variable shows that in the ring $K[T]$, the $P_j(T)$ divide $P(z,T)$. The coefficients
of the $P_j$ are not necessarily polynomials in $z$, but we know (exercise below)
that if $P$ is irreducible, i.e. does not have any non-trivial divisors with
coefficients in $\mathbf{C}[z]$, then neither does it have any with coefficients in $\mathbf{C}(z)$. Hence
there is a unique index $j$, and $X$ is connected, qed.

Exercise 2. Let $f(Y) = \sum a_k Y^k$ be a polynomial with coefficients $a_k \in \mathbf{Z}$.
The gcd $c(f)$ of these coefficients is said to be the content of $f$. (a) Let
$f, g \in \mathbf{Z}[Y]$ and let $p$ be a prime number dividing $c(fg)$, i.e. all the coefficients
of $fg$. Show that $p$ divides $c(f)$ or $c(g)$. [As $p$ divides $a_0 b_0$, it divides $a_0$
or $b_0$; if $p$ divides $a_0$ but not all the coefficients of $f$, let $r$ be the largest
integer such that $p$ divides $a_0, \ldots, a_{r-1}$. By calculating the coefficients of
$Y^r, Y^{r+1}, \ldots$ in $fg$, show that $p$ divides $b_0$, then $b_1$, etc.] (b) $f$ is said to
be primitive if $c(f) = 1$. Show that if $f$ and $g$ are primitive, then so is
$fg$ (“Gauss’ Lemma”). (c) Show that $c(fg) = c(f)c(g)$ for all $f, g \in \mathbf{Z}[Y]$.
(d) Show that the preceding arguments and results continue to hold if $\mathbf{Z}$ is
replaced by the ring $A = k[X]$ of polynomials in one variable with coefficients
in the field $k$ [replace “prime” with “irreducible”] and more generally by
an arbitrary principal ring $A$. (e) Let $P(X,Y)$ be an irreducible polynomial
with coefficients in a field $k$. Show that as a polynomial in $Y$ with coefficients
in the field $K = k(X)$ of rational fractions, $P$ is still irreducible. In other
words: if $P$ has a non-trivial divisor in $K[Y] = k(X)[Y]$, then it also has one
in $k[X,Y]$.
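
Part (c) of the exercise is easy to test on examples over $\mathbf{Z}$ before proving it. The following sketch is only an illustration: polynomials are represented as lists of integer coefficients $a_0, a_1, \ldots$, and the helper names `content` and `multiply` are hypothetical.

```python
from functools import reduce
from math import gcd

def content(f):
    """gcd of the integer coefficients of f, given as the list a_0, a_1, ..."""
    return reduce(gcd, (abs(a) for a in f), 0)

def multiply(f, g):
    """Product of two polynomials given by their coefficient lists."""
    prod = [0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            prod[i + j] += a * b
    return prod

if __name__ == "__main__":
    f = [6, 10, 4]     # 6 + 10Y + 4Y^2, content 2
    g = [9, 3, 15]     # 9 + 3Y + 15Y^2, content 3
    print(content(f), content(g), content(multiply(f, g)))
```

Such checks prove nothing, of course, but they make the statement $c(fg) = c(f)c(g)$ concrete.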


(v) Meromorphic functions on $\hat X$.


Theorem 8. All meromorphic functions on $\hat X$ are rational functions of $z$
and $\zeta$.


We will assume a result from the general theory of commutative field 
extensions though it is not hard to show. Otherwise the proof given will be 
complete. 

Let $\varphi$ be a meromorphic function on $\hat X$. It has finitely many poles
projecting onto points $a_i \in \hat{\mathbf{C}}$. Let $B_\varphi$ be the open set obtained by removing
from $B$ the points $a_i$ belonging to it. If, for some $z \in B_\varphi$, the $n$ (distinct)
points of $X$ over $z$ are numbered by $(z, \zeta_k)$, then the expression

(4.13) $\prod_k (T - \varphi(z,\zeta_k)) = T^n + c_1(z)T^{n-1} + \ldots + c_n(z)$

is well-defined. Its coefficients, i.e. the elementary symmetric functions of
the $\varphi(z,\zeta_k)$ up to sign, are defined on all of $B_\varphi$. These are holomorphic functions
on $B_\varphi$, because, over a sufficiently small disc $D \subset B_\varphi$ centered at $a$, the covering
space $X$ decomposes into “discs” $D(\alpha_k)$ on which $\varphi$ is a holomorphic function
of the local uniformizer $q$, and so of $z$, whence the result. As, on the other
hand, the function $\varphi$ is $O(q^{-N})$ in the neighbourhood of any point that does
not project onto $B_\varphi$, where $q$ is the local uniformizer at this point, the $c_i(z)$
have at most poles at the points of $\hat{\mathbf{C}} - B_\varphi$, hence are rational functions of $z$.

If $M$ denotes the field of meromorphic functions on $\hat X$ and if every rational
function $f(z)$ on $\hat{\mathbf{C}}$ is identified with the function $(z,\zeta) \mapsto f(z)$ on $\hat X$, then
it follows that every $\varphi \in M$ is algebraic of degree $\le n$ over $K = \mathbf{C}(z)$.

On the other hand, $M$ contains the field $L$ of rational functions of $z$ and
$\zeta$. As seen at the end of the previous section (iv), the polynomial $P(X,Y)$
is irreducible as a polynomial in $Y$ with coefficients in $K$. The following very
simple result then shows that $L$ has dimension $n$ over $K$:


Lemma 2. Let $M$ be a commutative field, $K$ a subfield of $M$ and $\zeta$ an element
of $M$ satisfying an irreducible algebraic equation of degree $n$ over $K$. Then
the subfield $L$ of $M$ generated by $K$ and $\zeta$ has dimension $n$ over $K$ and admits
$1, \zeta, \ldots, \zeta^{n-1}$ as a basis over $K$.

Since $\zeta^n$ is a linear combination of $1, \zeta, \ldots, \zeta^{n-1}$ with coefficients in $K$,
the same is true for $\zeta^p$ for all $p \ge n$:

$\zeta^{n+1} = \zeta(c_0 + \ldots + c_{n-1}\zeta^{n-1}) = c_0\zeta + \ldots + c_{n-1}\zeta^n =$
$= c_0\zeta + \ldots + c_{n-2}\zeta^{n-1} + c_{n-1}(c_0 + \ldots + c_{n-1}\zeta^{n-1})$,

etc. (induction on $p$). As a result, the set $A$ of polynomials in $\zeta$ with
coefficients in $K$ is a vector space of dimension $\le n$ over $K$. It is a subring of
$M$ and, for all $a \in A$, the map $u : x \mapsto ax$ from $A$ to $A$ is linear over $K$; it
is injective if $a \neq 0$ since we are in a field. As $A$ is finite-dimensional, $u$ is
bijective. So there is some $x \in A$ such that $ax = 1$. As a result, $A$ is a subfield
and $L = A$. If $L$ had dimension $< n$ over $K$, there would be a non-trivial
linear relation between $1, \zeta, \ldots, \zeta^{n-1}$ with coefficients in $K$. In other words,
$\zeta$ would satisfy an equation of degree $< n$ over $K$, a contradiction.
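
The key step of this proof, namely inverting $a \neq 0$ by solving the linear equation $ax = 1$ in the finite-dimensional space $A$, can be made explicit. The sketch below assumes, purely for illustration, $K = \mathbf{Q}$ and $\zeta^3 = 2$; elements of $A$ are lists of coefficients on the basis $1, \zeta, \zeta^2$, and the helper names `reduce_mod`, `mul` and `inverse` are hypothetical.

```python
from fractions import Fraction

def reduce_mod(poly, rel):
    """Reduce a polynomial in zeta (low degree first) using the relation
    zeta^n = rel[0] + rel[1]*zeta + ... + rel[n-1]*zeta^(n-1)."""
    n = len(rel)
    poly = [Fraction(c) for c in poly]
    while len(poly) > n:
        top = poly.pop()                  # leading coefficient, of zeta^d
        shift = len(poly) - n             # d - n
        for i, r in enumerate(rel):
            poly[shift + i] += top * r    # zeta^d = zeta^(d-n) * zeta^n
    return poly + [Fraction(0)] * (n - len(poly))

def mul(a, b, rel):
    """Product of two elements of A = K[zeta]."""
    prod = [Fraction(0)] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            prod[i + j] += x * y
    return reduce_mod(prod, rel)

def inverse(a, rel):
    """Solve a*x = 1 in A by inverting the K-linear map u : x -> a*x."""
    n = len(rel)
    # Column j of the matrix of u is a*zeta^j on the basis 1, ..., zeta^(n-1).
    cols = [mul(a, [Fraction(0)] * j + [Fraction(1)], rel) for j in range(n)]
    rows = [[cols[j][i] for j in range(n)] + [Fraction(1 if i == 0 else 0)]
            for i in range(n)]
    for c in range(n):                    # Gauss-Jordan elimination over Q
        pivot = next(r for r in range(c, n) if rows[r][c] != 0)
        rows[c], rows[pivot] = rows[pivot], rows[c]
        rows[c] = [v / rows[c][c] for v in rows[c]]
        for r in range(n):
            if r != c and rows[r][c] != 0:
                factor = rows[r][c]
                rows[r] = [v - factor * w for v, w in zip(rows[r], rows[c])]
    return [rows[i][n] for i in range(n)]

if __name__ == "__main__":
    rel = [2, 0, 0]                       # zeta^3 = 2
    a = [1, 1, 0]                         # a = 1 + zeta
    x = inverse(a, rel)
    print(x, mul(a, x, rel))              # x = (1 - zeta + zeta^2)/3
```

Injectivity of $u$ is what guarantees that a pivot is found in each column; for $a = 0$ the elimination fails, as it should.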
Hence, returning to meromorphic functions on $\hat X$:

$K \subset L \subset M$, $\dim_K(L) = n$.


To prove that $M = L$, it, therefore, suffices to show that $\dim_K(M) \le n$.
However, every $\varphi \in M$ is known to satisfy an algebraic (not necessarily irreducible)
equation of degree $n$ over $K$; so it suffices to prove (or to admit without proof,
which is what we will do here) the next general result:


Lemma 3. Let $M$ be a commutative field of characteristic 0, $K$ a subfield of
$M$ and $n$ an integer $\ge 1$. Suppose that all $a \in M$ satisfy an algebraic equation
of degree $\le n$ with coefficients in $K$. Then, the dimension of $M$ as a vector
space over $K$ is $\le n$.


This lemma is itself based on the primitive element theorem due to
Dedekind for the field of algebraic numbers and valid for all fields of
characteristic 0: if all $x \in M$ are algebraic over $K$, then for any finite number of
elements $x_1, \ldots, x_p \in M$, there exists $x$ such that the subfield $K[x_1, \ldots, x_p]$
generated by $K$ and the $x_i$ (these are obviously the polynomials in the $x_i$ with
coefficients in $K$: apply lemma 2 $p$ times) is equal to $K[x]$. If we admit this
result, then with the assumptions of lemma 3, $\dim_K K[x_1, \ldots, x_p] \le n$, which,
for $p = n + 1$, shows that $n + 1$ elements of $M$ can never be linearly
independent over $K$, qed.


(vi) The purely algebraic point of view.¹⁴ I will not go any further in
this theory. Let us, however, say a few words about another method for
associating a Riemann surface to any algebraic function field of one variable
over $\mathbf{C}$. This is the name given to any field $L$ containing the field $K = \mathbf{C}(X)$ of
rational fractions in one variable and finite-dimensional over $K$. The primitive
element theorem shows that $L$ is necessarily isomorphic to the field of rational

¹⁴ Serge Lang, Introduction to Algebraic and Abelian Functions (2nd ed., Springer,
1982).



fractions in $z$ and $\zeta$, where $\zeta$ is an algebraic function of $z$; hence, if the base
field is $\mathbf{C}$, this is not a generalization. This point of view is different because
we do not a priori choose some $\zeta \in L$ such that $L$ is the set of polynomials in
$\zeta$ with coefficients in $\mathbf{C}(z)$. That being the case, how can a Riemann surface
$\hat X$ associated to $L$ be constructed?

The basic idea is very simple and, with some technical changes, applies to
far more general fields than $\mathbf{C}$. The $x \in L$ must be meromorphic functions on
$\hat X$. If that is the case, a value $x(P) \in \hat{\mathbf{C}}$ can be assigned to $x$ at each point
$P \in \hat X$; it must satisfy the following conditions:


(P1) the set of $x$ such that $x(P) \neq \infty$ is a subring $\mathfrak{o}(P)$ of $L$, and the map
$x \mapsto x(P)$ is a homomorphism from $\mathfrak{o}(P)$ onto $\mathbf{C}$;

(P2) the relation $x(P) = \infty$ implies that $x^{-1}(P) = 0$;

(P3) $a(P) = a$ for all $a \in \mathbf{C}$.


Any map $P : x \mapsto x(P)$ from $L$ to $\hat{\mathbf{C}}$ satisfying the previous conditions is,
by definition, a place of the field $L$. Having said that, the Riemann surface
sought is the set of these places, a set on which a compact Riemann surface
structure is then defined.

Some authors define the places of $L$ by using the associated rings $\mathfrak{o}(P)$,
which can be characterized directly: they must contain $\mathbf{C}$ and satisfy

$x \notin \mathfrak{o} \Longrightarrow x^{-1} \in \mathfrak{o}$.

A subring of a field $K$ with this property is a valuation ring of $K$. An example
in the field $\mathbf{Q}$ of rational numbers is the set of fractions whose denominator
is not divisible by a given prime $p$. An example in $\mathbf{C}(X)$ is the set of rational
fractions that are holomorphic at some given $a \in \hat{\mathbf{C}}$.

Exercise 3. We thus obtain all the valuation rings of $\mathbf{Q}$, and of $\mathbf{C}(X)$
containing $\mathbf{C}$, which corresponds to the fact that the Riemann surface of
$\mathbf{C}(X)$ is $\hat{\mathbf{C}}$.

Other authors prefer to define the places of a field $L$ of algebraic functions
by using valuations. For them, a point of the Riemann surface is a map

$v : L^* \to \mathbf{Z}$

satisfying the following conditions:

$v(xy) = v(x) + v(y)$, $v(x + y) \ge \min(v(x), v(y))$

and $v(x) = 0$ if $x \in \mathbf{C}$. This definition corresponds to the fact that at every
point $P$ of the Riemann surface of $L$, its order $v_P(x)$ at $P$ can be associated
to each $x \in L$ since in the neighbourhood of $P$, the function $x$ is a
Laurent series in a local uniformizer. The reader will have no difficulty in
determining all the valuations of $\mathbf{C}(X)$, or of $\mathbf{Q}$.
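
For the field $\mathbf{Q}$, the $p$-adic valuation gives a concrete instance of these axioms, and its valuation ring is the set of fractions whose denominator is not divisible by $p$. The sketch below is only an illustration; the function names `v_p` and `in_valuation_ring` are hypothetical, and the value at 0 is deliberately left undefined.

```python
from fractions import Fraction

def v_p(x, p):
    """p-adic valuation of a non-zero rational number x."""
    x = Fraction(x)
    if x == 0:
        raise ValueError("the valuation of 0 is not defined here")
    num, den, v = x.numerator, x.denominator, 0
    while num % p == 0:       # powers of p in the numerator count positively
        num //= p
        v += 1
    while den % p == 0:       # powers of p in the denominator count negatively
        den //= p
        v -= 1
    return v

def in_valuation_ring(x, p):
    """True when x is 0 or has non-negative p-adic valuation."""
    return x == 0 or v_p(x, p) >= 0

if __name__ == "__main__":
    x, y = Fraction(18, 5), Fraction(5, 27)
    print(v_p(x, 3), v_p(y, 3), v_p(x * y, 3), v_p(x + y, 3))
```

Here $v(xy) = v(x) + v(y)$ holds exactly, while $v(x + y)$ can be strictly larger than $\min(v(x), v(y))$ only when $v(x) = v(y)$.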
Ultimately, the points of a Riemann surface of a field $L$ of algebraic
functions are, interchangeably, the places, the valuation rings or the valuations
of $L$. We go from a valuation ring $\mathfrak{o}$ to a place by observing that the $x \in \mathfrak{o}$
that are not invertible in $\mathfrak{o}$ form an ideal $\mathfrak{p}$ of $\mathfrak{o}$ and that the quotient $\mathfrak{o}/\mathfrak{p}$
is isomorphic to $\mathbf{C}$; for $x \in \mathfrak{o}$, $x(P)$ is then the class of $x$ mod $\mathfrak{p}$, and we set
$x(P) = \infty$ if $x \notin \mathfrak{o}$. On the other hand, it can be shown that the only non-zero ideals
of $\mathfrak{o}$ are the pairwise distinct powers $\mathfrak{p}^n$ of $\mathfrak{p}$; for non-zero $x \in \mathfrak{o}$, $v_P(x)$ is then the
largest $n$ such that $x \in \mathfrak{p}^n$, and for $x \notin \mathfrak{o}$ set $v_P(x) = -v_P(x^{-1})$. In fact,
$\mathfrak{p}$ is the set of the $x \in L$ that vanish at the point $P$ of the Riemann surface,
and $\mathfrak{p}^n$ the set of $x$ having a zero of order $\ge n$ at $P$.

The relations with the theory of algebraic curves also need to be men- 
tioned. 

The algebraic point of view was invented by Richard Dedekind and Heinrich 
Weber¹⁵ about thirty years after Riemann. Riemann’s rather obscure and vague 
constructions, and a fortiori the “verbosity” of his far less brilliant successors, 
must have annoyed the crystal clear minds of these two algebraists. As 
Dedekind had already provided a clear and in many respects final form of 
the theory of algebraic number fields,¹⁶ he naturally tried to apply similar 
methods to algebraic function fields of one variable by replacing Q by C(X). 
About thirty years later, Hermann Weyl, in Die Idee der Riemannschen Fläche, 
introduced the first correct ideas about “abstract” 1-dimensional complex 
manifolds into this question and used Dirichlet’s principle to prove a pri- 
ori the existence of “many” meromorphic functions on Riemann surfaces. 
This result, which is obvious for surfaces associated to algebraic functions, 
requires even now a long proof in the general case. In the 1930s and 40s, in 
particular thanks to André Weil and Oscar Zariski, the notion of an n-dimensional 
algebraic variety over an arbitrary field was beginning to be understood and 
a purely algebraic machinery was being set up. It was an improvement on the 
doubtful geometric arguments of the Italian school (which nonetheless had 
discovered results and introduced very important ideas since 1870). Start- 
ing with the notion of a place and ending with that of a complex analytic 
manifold, Claude Chevalley published the first post-war modern presenta- 
tion on algebraic functions in one variable. Reading it is still advisable. Serge 
Lang’s book goes much further in about a hundred pages, which indicates 
how concise the proofs are... 


¹⁵ Author of a Lehrbuch der Algebra where almost everything that was known in 
algebra around 1900 can be found. 

¹⁶ It has been substantially improved on and generalized, but without fundamentally 
changing its point of view other than by introducing the notion of valuation. 
Reading his main articles, which can be found in his complete works, is still a most 
advisable exercise. 
Towards the end of the 1950s, Alexandre Grothendieck, influenced by a 
seminar given by Chevalley on algebraic varieties and by Jean-Pierre Serre’s 
introduction of sheaf theory into the question, entered the picture and made 
the theory so general and abstract that it can no longer be understood 
unless one reads his 30NP,¹⁷ written by Dieudonné, whom nothing 
could ever stop, and those of his disciples; better enter a convent. Nevertheless, 
thanks to Grothendieck’s theory of schemes, which consists of algebraic 
geometry over a ring rather than a field, it has been possible to solve previously 
inaccessible classical problems on modular functions or in number 
theory. 


¹⁷ NP: abbreviation for “new page”, a concept invented by the Bourbaki group 
when France multiplied the value of its currency by a hundred and introduced “new francs”. 